Description

FIELD OF THE INVENTION 

[0001] The present invention relates to a method of producing proteins, polypeptides or peptides comprising
administering a composition comprising a transposon-based vector to an oviduct or an ovary of an animal. Such administration
results in incorporation of a gene of interest contained in the vector in the ovary, the oviduct or an ovum of the animal.
The present invention further includes production of a protein encoded by the gene in an egg produced by the animal.

BACKGROUND OF THE INVENTION

[0002] Transgenic animals are desirable for a variety of reasons, including their potential as biological factories to
produce desired molecules for pharmaceutical, diagnostic and industrial uses. This potential is attractive to the industry
due to the inadequate capacity in facilities used for recombinant production of desired molecules and the increasing
demand by the pharmaceutical industry for use of these facilities. Numerous attempts to produce transgenic animals
have met several problems, including low rates of gene incorporation and unstable gene incorporation. Accordingly,
improved gene technologies are needed for the development of transgenic animals for the production of desired molecules.

[0003] Improved gene delivery technologies are also needed for the treatment of disease in animals and humans.
Many diseases and conditions can be treated with gene-delivery technologies, which provide a gene of interest to a
patient suffering from the disease or the condition. An example of such a disease is Type 1 diabetes. Type 1 diabetes is
an autoimmune disease that ultimately results in destruction of the insulin-producing β-cells in the pancreas. Although
patients with Type 1 diabetes may be treated adequately with insulin injections or insulin pumps, these therapies are
only partially effective. Insulin replacement, such as via insulin injection or pump administration, cannot fully reverse the
defect in the vascular endothelium found in the hyperglycemic state (Pieper et al., 1996. Diabetes Res. Clin. Pract.
Suppl. S157-S162). In addition, hyper- and hypoglycemia occur frequently despite intensive home blood glucose
monitoring. Finally, careful dietary constraints are needed to maintain an adequate ratio of calories consumed. This often
causes major psychosocial stress for many diabetic patients. Development of gene therapies providing delivery of the
insulin gene into the pancreas of diabetic patients could overcome many of these problems and result in improved life
expectancy and quality of life.

[0004] Several of the prior art gene delivery technologies employed viruses that are associated with potentially
undesirable side effects and safety concerns. The majority of current gene-delivery technologies useful for gene therapy rely
on virus-based delivery vectors, such as adeno and adeno-associated viruses, retroviruses, and other viruses, which
have been attenuated to no longer replicate (Kay, M.A., et al. 2001. Nature Medicine 7:33-40).

[0005] There are multiple problems associated with the use of viral vectors. Firstly, they are not tissue-specific. In fact,
a gene therapy trial using adenovirus was recently halted because the vector was present in the patient's sperm (Gene
trial to proceed despite fears that therapy could change child's genetic makeup. The New York Times, December 23,
2001). Secondly, viral vectors are likely to be transiently incorporated, which necessitates re-treating a patient at specified
time intervals (Kay, M.A., et al. 2001. Nature Medicine 7:33-40). Thirdly, there is a concern that a viral-based vector
could revert to its virulent form and cause disease. Fourthly, viral-based vectors require a dividing cell for stable integration.
Fifthly, viral-based vectors indiscriminately integrate into various cells, which can result in undesirable germline
integration. Sixthly, the high titers needed to achieve the desired effect have resulted in the death of one patient, and
they are believed to be responsible for induction of cancer in a separate study (Science, News of the Week, October
4, 2002).

[0006] Accordingly, what is needed is a new method to produce transgenic animals and humans with stably incorporated
genes, in which the vector containing those genes does not cause disease or other unwanted side effects. There is also
a need for DNA constructs that would be stably incorporated into the tissues and cells of animals and humans, including
cells in the resting state that are not replicating. There is a further recognized need in the art for DNA constructs capable
of delivering genes to specific tissues and cells of animals and humans.

[0007] When incorporating a gene of interest into an animal for the production of a desired protein or when incorporating
a gene of interest in an animal or human for the treatment of a disease, it is often desirable to selectively activate
incorporated genes using inducible promoters. These inducible promoters are regulated by substances either produced
or recognized by the transcription control elements within the cell in which the gene is incorporated. In many instances,
control of gene expression is desired in transgenic animals or humans so that incorporated genes are selectively activated
at desired times and/or under the influence of specific substances. Accordingly, what is needed is a means to selectively
activate genes introduced into the genome of cells of a transgenic animal or human. This can be taken a step further to
cause incorporation to be tissue-specific, which prevents widespread gene incorporation throughout a patient's body
(animal or human). This decreases the amount of DNA needed for a treatment, decreases the chance of incorporation
in gametes, and targets gene delivery, incorporation, and expression to the desired tissue where the gene is needed to
function. What is also needed is a method for rapidly producing a protein or peptide of interest in eggs
and milk of transgenic animals.

SUMMARY OF THE INVENTION

[0008] The present invention addresses the problems described above by providing a method according to claim 1.
[0009] Animals are made transgenic through administration of a composition comprising a transposon-based vector
designed for incorporation of a gene of interest for production of a desired protein, together with an acceptable carrier.
The compositions used according to the present invention are introduced into an oviduct or an ovary of a bird. The
compositions used according to the present invention may be administered to a reproductive organ of an animal through
the cloaca. The compositions used according to the present invention may be directly administered to an oviduct or an
ovary or can be administered to an artery leading to an oviduct or an ovary. A transfection reagent is optionally added
to the composition before administration.

[0010] The transposon-based vectors of the present invention include a transposase, operably-linked to a first promoter,
and a coding sequence for a protein or peptide of interest operably-linked to a second promoter, wherein the coding
sequence for the protein or peptide of interest and its operably-linked promoter are flanked by transposase insertion
sequences recognized by the transposase and wherein the first promoter comprises a modified Kozak sequence
comprising ACC ATG (SEQ ID NO:1). The transposon-based vector also includes the following characteristics: a) one or
more modified Kozak sequences at the 3' end of the first promoter to enhance expression of the transposase;
b) modifications of the codons for the first several N-terminal amino acids of the transposase, wherein the nucleotide at the
third base position of each codon is changed to an A or a T without changing the corresponding amino acid; c) addition
of one or more stop codons to enhance the termination of transposase synthesis; and/or, d) addition of an effective
polyA sequence operably linked to the transposase to further enhance expression of the transposase gene. In some
embodiments, the effective polyA sequence is an avian optimized polyA sequence.

[0011] The present invention also provides for tissue-specific incorporation and/or expression of a gene of interest.
Tissue-specific incorporation of a gene of interest may be achieved by placing the transposase gene under the control
of a tissue-specific promoter, whereas tissue-specific expression of a gene of interest may be achieved by placing the
gene of interest under the control of a tissue-specific promoter. In some embodiments, the gene of interest is transcribed
under the influence of an ovalbumin, or other oviduct-specific, promoter. Linking the gene of interest to an oviduct-specific
promoter in an egg-laying animal results in synthesis of a desired molecule and deposition of the desired molecule in a
developing egg.

[0012] The present invention advantageously produces a high number of transgenic animals having a gene of interest
stably incorporated. In some embodiments wherein the transposon-based vector is administered to the ovary, these
transgenic animals successfully pass the desired gene to their progeny. Accordingly, the present invention can be used
to obtain transgenic animals having the gene of interest incorporated into the germline through transfection of the ovary
or the present invention can be used to obtain transgenic animals having the gene of interest incorporated into the
oviduct in a tissue-specific manner. Both types of transgenic animals of the present invention produce large amounts of
a desired molecule encoded by the transgene. Transgenic egg-laying animals, particularly avians, produce large amounts
of a desired protein that is deposited in the egg for rapid harvest and purification.

[0013] Any desired gene may be incorporated into the novel transposon-based vectors of the present invention in 
order to synthesize a desired molecule in the transgenic animals. Proteins, peptides and nucleic acids are the desired 
molecules to be produced by the transgenic animals of the present invention. Particularly preferred proteins are antibody 
proteins and other immunopharmaceutical proteins. 

[0014] This invention provides the use of a composition useful for the production of transgenic hens capable of
producing substantially high amounts of a desired protein or peptide. Entire flocks of transgenic birds may be developed
very quickly in order to produce industrial amounts of desired molecules. The present invention solves the problems
inherent in the inadequate capacity of fermentation facilities used for bacterial production of molecules and provides a
more efficient and economical way to produce desired molecules. Accordingly, the present invention provides a means
to produce large amounts of therapeutic, diagnostic and reagent molecules.

[0015] Transgenic chickens are excellent in terms of convenience and efficiency of manufacturing molecules such as 
proteins and peptides. Starting with a single transgenic rooster, thousands of transgenic offspring can be produced 
within a year. (In principle, up to forty million offspring could be produced in just three generations). Each transgenic 
female is expected to lay at least 250 eggs/year, each potentially containing hundreds of milligrams of the selected
protein. Flocks of chickens numbering in the hundreds of thousands are readily handled through established commercial
systems. The technologies for obtaining eggs and fractionating them are also well known and widely accepted. Thus, 
for each therapeutic, diagnostic, or other protein of interest, large amounts of a substantially pure material can be 
produced at relatively low incremental cost. 
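
The scale implied by these figures can be illustrated with a short back-of-envelope calculation. The sketch below is only an illustration: the egg count is taken from the text, while the per-egg protein amount (100 mg) and the flock size (100,000 hens) are assumed values chosen at the low end of "hundreds of milligrams" and "hundreds of thousands", and the function name is hypothetical.

```python
# Back-of-envelope yield estimate. Only the egg count comes from the text;
# the other two inputs are illustrative assumptions, not measured values.
EGGS_PER_HEN_PER_YEAR = 250   # "at least 250 eggs/year" (from the text)
MG_PROTEIN_PER_EGG = 100      # assumed: low end of "hundreds of milligrams"
FLOCK_SIZE = 100_000          # assumed: low end of "hundreds of thousands"


def annual_yield_grams(hens: int, eggs_per_hen: int, mg_per_egg: float) -> float:
    """Return the total protein deposited in eggs per year, in grams."""
    return hens * eggs_per_hen * mg_per_egg / 1000.0


per_hen = annual_yield_grams(1, EGGS_PER_HEN_PER_YEAR, MG_PROTEIN_PER_EGG)
per_flock = annual_yield_grams(FLOCK_SIZE, EGGS_PER_HEN_PER_YEAR, MG_PROTEIN_PER_EGG)

print(f"{per_hen:.1f} g of protein per hen per year")
print(f"{per_flock / 1000:.0f} kg of protein per flock per year")
```

Under these assumed inputs a single hen would deposit about 25 g of protein per year before any purification losses, which are not modeled here.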
[0016] A wide range of recombinant peptides and proteins can be produced in transgenic egg-laying animals. Enzymes,
hormones, antibodies, growth factors, serum proteins, commodity proteins, biological response modifiers, peptides and
designed proteins may all be made through practice of the present invention. For example, rough estimates suggest
that it is possible to produce in bulk growth hormone, insulin, or Factor VIII, and deposit them in egg whites, for an
incremental cost in the order of one dollar per gram. At such prices it is feasible to consider administering such medical
agents by inhalation or even orally, instead of through injection. Even if bioavailability rates through these avenues were
low, the cost of a much higher effective dose would not be prohibitive.

[0017] In one embodiment, the egg-laying transgenic animal is an avian. The methods of the present invention may
be used in avians including Ratites, Psittaciformes, Falconiformes, Piciformes, Strigiformes, Passeriformes,
Coraciformes, Ralliformes, Cuculiformes, Columbiformes, Galliformes, Anseriformes, and Herodiones. Preferably, the
egg-laying transgenic animal is a poultry bird. More preferably, the bird is a chicken, turkey, duck, goose or quail. Another
preferred bird is a ratite, such as an emu, an ostrich, a rhea, or a cassowary. Other preferred birds are partridge,
pheasant, kiwi, parrot, parakeet, macaw, falcon, eagle, hawk, pigeon, cockatoo, song birds, jay bird, blackbird, finch,
warbler, canary, toucan, mynah, or sparrow.

[0018] The present invention makes reference to novel transposon-based vectors.

[0019] The present invention makes reference to novel transposon-based vectors that encode for the production of 
desired proteins or peptides in cells. 

The present invention makes reference to the production of transgenic animals through intraoviduct or intraovarian
administration of a transposon-based vector.

The present invention makes reference to the production of transgenic animals through intraoviduct or intraovarian
administration of a transposon-based vector, wherein the transgenic animals produce desired proteins or peptides.

[0020] The present invention makes reference to a method to produce transgenic animals through intraovarian
administration of a transposon-based vector that are capable of producing transgenic progeny.

[0021] An object of the present invention is to provide a method to produce transgenic animals through intraoviduct
or intraovarian administration of a transposon-based vector that are capable of producing a desired molecule, such as
a protein, peptide or nucleic acid.

[0022] Another object of the present invention is to provide a method to produce transgenic animals through intraoviduct
or intraovarian administration of a transposon-based vector, wherein such administration results in modulation of
endogenous gene expression.

[0023] It is yet another object of the present invention to provide a method to produce transgenic avians through
intraoviduct or intraovarian administration of a transposon-based vector that are capable of producing proteins, peptides
or nucleic acids.

[0024] It is another object of the present invention to produce transgenic animals through intraoviduct or intraovarian
administration of a transposon-based vector encoding an antibody or a fragment thereof.

[0025] Still another object of the present invention is to provide a method to produce transgenic avians through
intraoviduct or intraovarian administration of a transposon-based vector that are capable of producing proteins or peptides
and depositing these proteins or peptides in the egg.

[0026] Another object of the present invention is to provide transgenic avians that contain a stably incorporated
transgene.

[0027] Still another object of the present invention is to provide eggs containing desired proteins or peptides encoded
by a transgene incorporated into the transgenic avian that produces the egg.

An advantage of the present invention is that transgenic animals are produced by the method of the present invention
with higher efficiencies than observed in the prior art.

[0028] Another advantage of the present invention is that these transgenic animals possess high copy numbers of
the transgene.

[0029] Another advantage of the present invention is that the transgenic animals produce large amounts of desired 
molecules encoded by the transgene. 

[0030] Still another advantage of the present invention is that desired molecules are produced by the transgenic
animals much more efficiently and economically than prior art methods, thereby providing a means for large scale
production of desired molecules, particularly proteins and peptides.

[0031] Yet another advantage of the present invention is that the desired proteins and peptides are produced rapidly 
after making animals transgenic through introduction of the vectors of the present invention. 

[0032] These and other objects, features and advantages of the present invention will become apparent after a review 
of the following detailed description of the disclosed embodiments and claims. 


BRIEF DESCRIPTION OF THE FIGURES 
[0033]

Figure 1 depicts schematically a transposon-based vector containing a transposase operably-linked to a first promoter
and a gene of interest operably-linked to a second promoter, wherein the gene of interest and its operably-linked
promoter are flanked by insertion sequences (IS) recognized by the transposase. "Pro" designates a promoter. In
this and subsequent figures, the size of the actual nucleotide sequence is not necessarily proportionate to the box
representing that sequence.

Figure 2 depicts schematically a transposon-based vector for targeting deposition of a polypeptide in an egg white
wherein Ov pro is the ovalbumin promoter, Ov protein is the ovalbumin protein and PolyA is a polyadenylation
sequence. The TAG sequence includes a spacer sequence, the gp41 hairpin loop from HIV I and a protease cleavage
site.

Figure 3 depicts schematically a transposon-based vector for targeting deposition of a polypeptide in an egg white
wherein Ovo pro is the ovomucoid promoter and Ovo SS is the ovomucoid signal sequence. The TAG sequence
includes a spacer, the gp41 hairpin loop from HIV I and a protease cleavage site.

Figure 4 depicts schematically a transposon-based vector for expression of an RNAi molecule. "Tet pro" indicates
a tetracycline inducible promoter whereas "pro" indicates the pro portion of a prepro sequence as described herein.
"Ovgen" indicates approximately 60 base pairs of an ovalbumin gene, "Ovotrans" indicates approximately 60 base
pairs of an ovotransferrin gene and "Ovomucin" indicates approximately 60 base pairs of an ovomucin gene.

Figure 5 is a picture of an SDS-PAGE gel wherein a pooled fraction of an isolated proinsulin fusion protein was run
in lanes 4 and 6. Lanes 1 and 10 of the gel contain molecular weight standards, lanes 2 and 8 contain non-transgenic
chicken egg white, and lanes 3, 5, 7 and 9 are blank.


DETAILED DESCRIPTION OF THE INVENTION 

[0034] The present invention provides a new, effective and efficient method of producing transgenic animals, i.e. birds,
through administration of a composition comprising a transposon-based vector designed for incorporation of a gene of
interest and production of a desired molecule. The transposon-based vectors are administered to an oviduct or an ovary.

[0035] The vectors may be directly administered to an oviduct or an ovary or can be administered to an artery leading
to an oviduct or an ovary or to a lymph system proximate to the cells to be genetically altered. The vectors may be
administered to an oviduct or an ovary of an animal through the cloaca. One method of direct administration is by injection,
and in one embodiment, the lumen of the magnum of the oviduct is injected with a transposon-based vector. In another
embodiment of direct administration by injection, the lumen of the infundibulum of the oviduct is
injected with a transposon-based vector. A preferred intraarterial administration is an administration into an artery that
supplies the oviduct or the ovary. In some embodiments, administration of the transposon-based vector to an oviduct
or an artery that leads to the oviduct results in incorporation of the vector into the epithelial and/or secretory cells of the
oviduct. In other embodiments, administration of the transposon-based vector to an ovary or an artery that leads to the
ovary or a lymphatic system proximal to the ovary results in incorporation of the vector into an oocyte or a germinal disk
inside the ovary.

Definition 

[0036] It is to be understood that as used in the specification and in the claims, "a" or "an" can mean one or more,
depending upon the context in which it is used. Thus, for example, reference to "a cell" can mean that at least one cell
can be utilized.

[0037] The term "antibody" is used interchangeably with the term "immunoglobulin" and is defined herein as a protein
synthesized by an animal or a cell of the immune system in response to the presence of a foreign substance commonly
referred to as an "antigen" or an "immunogen". The term antibody includes fragments of antibodies. Antibodies are
characterized by specific affinity to a site on the antigen, wherein the site is referred to as an "antigenic determinant" or an
"epitope". Antigens can be naturally occurring or artificially engineered. Artificially engineered antigens include, but are
not limited to, small molecules, such as small peptides, attached to haptens such as macromolecules, for example
proteins, nucleic acids, or polysaccharides. Artificially designed or engineered variants of naturally occurring antibodies
and artificially designed or engineered antibodies not occurring in nature are all included in the current definition. Such
variants include conservatively substituted amino acids and other forms of substitution as described in the section
concerning proteins and polypeptides.

[0038] As used herein, the term "egg-laying animal" includes all amniotes such as birds, turtles, lizards and
monotremes. Monotremes are egg-laying mammals and include the platypus and echidna. The term "bird" or "fowl," as
used herein, is defined as a member of the Aves class of animals which are characterized as warm-blooded, egg-laying
vertebrates primarily adapted for flying. Avians include, without limitation, Ratites, Psittaciformes, Falconiformes,
Piciformes, Strigiformes, Passeriformes, Coraciformes, Ralliformes, Cuculiformes, Columbiformes, Galliformes,
Anseriformes, and Herodiones. The term "Ratite," as used herein, is defined as a group of flightless, mostly large, running
birds comprising several orders and including the emus, ostriches, kiwis, and cassowaries. The term "Psittaciformes",
as used herein, includes parrots and refers to a monofamilial order of birds that exhibit zygodactylism and have a strong
hooked bill. A "parrot" is defined as any member of the avian family Psittacidae (the single family of the Psittaciformes),
distinguished by the short, stout, strongly hooked beak. Avians include all poultry birds, especially chickens, geese,
turkeys, ducks and quail. The term "chicken" as used herein denotes chickens used for table egg production, such as
egg-type chickens, chickens reared for public meat consumption, or broilers, and chickens reared for both egg and
meat production ("dual-purpose" chickens). The term "chicken" also denotes chickens produced by primary breeder
companies, or chickens that are the parents, grandparents, great-grandparents, etc. of those chickens reared for public
table egg, meat, or table egg and meat consumption.

[0039] The term "egg" is defined herein as including a large female sex cell enclosed in a porous, calcareous or leathery
shell, produced by birds and reptiles. The term "ovum" is defined as a female gamete, and is also known as an egg.
Therefore, egg production in all animals other than birds and reptiles, as used herein, is defined as the production and
discharge of an ovum from an ovary, or "ovulation". Accordingly, it is to be understood that the term "egg" as used herein
is defined as a large female sex cell enclosed in a porous, calcareous or leathery shell, when a bird or reptile produces
it, or it is an ovum when it is produced by all other animals.

[0040] The term "milk-producing animal" refers herein to mammals including, but not limited to, bovine, ovine, porcine,
equine, and primate animals. Milk-producing animals include but are not limited to cows, llamas, camels, goats, reindeer,
zebu, water buffalo, yak, horses, pigs, rabbits, non-human primates, and humans.
[0041] The term "gene" is defined herein to include a coding region for a protein, peptide or polypeptide. 

[0042] The term "transgenic animal" refers to an animal having at least a portion of the transposon-based vector DNA
incorporated into its DNA. While a transgenic animal includes an animal wherein the transposon-based vector DNA is
incorporated into the germline DNA, a transgenic animal also includes an animal having DNA in one or more cells that
contain a portion of the transposon-based vector DNA for any period of time. In a preferred embodiment, a portion of
the transposon-based vector comprises a gene of interest. More preferably, the gene of interest is incorporated into the
animal's DNA for a period of at least five days, more preferably the reproductive life of the animal, and most preferably
the life of the animal. In a further preferred embodiment, the animal is an avian.

[0043] The term "vector" is used interchangeably with the terms "construct", "DNA construct" and "genetic construct"
to denote synthetic nucleotide sequences used for manipulation of genetic material, including but not limited to cloning,
subcloning, sequencing, or introduction of exogenous genetic material into cells, tissues or organisms, such as birds. It
is understood by one skilled in the art that vectors may contain synthetic DNA sequences, naturally occurring DNA
sequences, or both. The vectors of the present invention are transposon-based vectors as described herein.
[0044] When referring to two nucleotide sequences, one being a regulatory sequence, the term "operably-linked" is
defined herein to mean that the two sequences are associated in a manner that allows the regulatory sequence to affect
expression of the other nucleotide sequence. It is not required that the operably-linked sequences be directly adjacent
to one another with no intervening sequence(s).

[0045] The term "regulatory sequence" is defined herein as including promoters, enhancers and other expression control
elements such as polyadenylation sequences, matrix attachment sites, insulator regions for expression of multiple genes
on a single construct, ribosome entry/attachment sites, introns that are able to enhance expression, and silencers.

Transposon-Based Vectors

[0046] Transposon-based vectors according to the invention are transposon-based vectors which are used in the 
method of the present invention. 

[0047] While not wanting to be bound by the following statement, it is believed that the nature of the DNA construct
is an important factor in successfully producing transgenic animals. The "standard" types of plasmid and viral vectors
that have previously been almost universally used for transgenic work in all species, especially avians, have low
efficiencies and may constitute a major reason for the low rates of transformation previously observed. The DNA (or RNA)
constructs previously used often do not integrate into the host DNA, or integrate only at low frequencies. Other factors
may have also played a part, such as poor entry of the vector into target cells. The present invention provides
transposon-based vectors that can be administered to an animal and that overcome the prior art problems relating to low transgene
integration frequencies. Two preferred transposon-based vectors of the present invention in which a transposase, gene
of interest and other polynucleotide sequences may be introduced are termed pTnMCS (SEQ ID NO:2) and pTnMod
(SEQ ID NO:3).

[0048] The transposon-based vectors of the present invention produce integration frequencies an order of magnitude
greater than has been achieved with previous vectors. More specifically, intratesticular injections performed with a prior
art transposon-based vector (described in U.S. Patent No. 5,719,055) resulted in 41% sperm positive roosters whereas
intratesticular injections performed with the novel transposon-based vectors of the present invention resulted in 77%
sperm positive roosters. Actual frequencies of integration were estimated by either or both comparative strength of the
PCR signal from the sperm and histological evaluation of the testes and sperm by quantitative PCR.
[0049] The transposon-based vectors of the present invention include a transposase gene operably-linked to a first
promoter, and a coding sequence for a desired protein or peptide operably-linked to a second promoter, wherein the
coding sequence for the desired protein or peptide and its operably-linked promoter are flanked by transposase insertion
sequences recognized by the transposase. The transposon-based vector also includes one or more of the following
characteristics: a) one or more modified Kozak sequences comprising ACCATG (SEQ ID NO:1) at the 3' end of the first
promoter to enhance expression of the transposase; b) modifications of the codons for the first several N-terminal amino
acids of the transposase, wherein the third base of each codon was changed to an A or a T without changing the
corresponding amino acid; c) addition of one or more stop codons to enhance the termination of transposase synthesis;
and/or, d) addition of an effective polyA sequence operably-linked to the transposase to further enhance expression of
the transposase gene. The transposon-based vector may additionally or alternatively include one or more of the following
Kozak sequences at the 3' end of any promoter, including the promoter operably-linked to the transposase: ACCATGG
(SEQ ID NO:4), AAGATGT (SEQ ID NO:5), ACGATGA (SEQ ID NO:6), AAGATGG (SEQ ID NO:7), GACATGA (SEQ
ID NO:8), AGCATGA (SEQ ID NO:9), ACGATGA (SEQ ID NO:10), and ACCATGT (SEQ ID NO:52).

[0050] Figure 1 shows a schematic representation of several components of the transposon-based vector. The present
invention further includes vectors containing more than one gene of interest, wherein a second or subsequent gene of
interest is operably-linked to the second promoter or to a different promoter. It is also to be understood that the
transposon-based vectors shown in the Figures are representative of the present invention and that the order of the vector elements
may be different than that shown in the Figures, that the elements may be present in various orientations, and that the
vectors may contain additional elements not shown in the Figures.
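
The arrangement of elements described above can be pictured as an ordered list of labeled parts. The following is a minimal sketch, not a representation of any actual vector sequence: the `Element` class, the element names and the `cargo_is_flanked` helper are hypothetical, and the check encodes only the one constraint stated in the text, namely that the gene of interest and its promoter lie between two insertion sequences. Because the order and orientation of the other elements may vary, nothing else is enforced.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Element:
    kind: str  # hypothetical labels: "promoter", "transposase", "polyA", "IS", "gene"
    name: str


def cargo_is_flanked(elements: Sequence[Element]) -> bool:
    """Return True if a promoter and a gene lie between exactly two insertion sequences."""
    is_positions = [i for i, e in enumerate(elements) if e.kind == "IS"]
    if len(is_positions) != 2:
        return False
    left, right = is_positions
    kinds_inside = {e.kind for e in elements[left + 1:right]}
    return "promoter" in kinds_inside and "gene" in kinds_inside


# Illustrative layout loosely following the description of Figure 1 (names are assumptions).
figure1_like = [
    Element("promoter", "first promoter"),
    Element("transposase", "transposase"),
    Element("polyA", "polyA"),
    Element("IS", "insertion sequence (left)"),
    Element("promoter", "second promoter"),
    Element("gene", "gene of interest"),
    Element("IS", "insertion sequence (right)"),
]

# Counter-example: the insertion sequences do not enclose the promoter and gene.
unflanked = [
    Element("IS", "insertion sequence (left)"),
    Element("IS", "insertion sequence (right)"),
    Element("promoter", "second promoter"),
    Element("gene", "gene of interest"),
]

print(cargo_is_flanked(figure1_like))
print(cargo_is_flanked(unflanked))
```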

Transposases and Insertion Sequences 

[0051] In a further embodiment of the present invention, the transposase found in the transposase-based vector is an
altered target site (ATS) transposase and the insertion sequences are those recognized by the ATS transposase.
However, the transposase located in the transposase-based vectors is not limited to a modified ATS transposase and
can be derived from any transposase. Transposases known in the prior art include those found in AG7, Tn5SEQ1,
Tn916, Tn951, Tn1721, Tn2410, Tn1681, Tn1, Tn2, Tn3, Tn4, Tn5, Tn6, Tn9, Tn10, Tn30, Tn101, Tn903, Tn501,
Tn1000 (γδ), Tn1681, Tn2901, Ac transposons, Mp transposons, Spm transposons, En transposons, Dotted transposons,
Mu transposons, Ds transposons, dSpm transposons and I transposons. According to the present invention, these
transposases and their regulatory sequences are modified for improved functioning as follows: a) the addition of one or
more modified Kozak sequences comprising ACCATG (SEQ ID NO:1) at the 3' end of the promoter operably-linked to
the transposase; b) a change of the codons for the first several amino acids of the transposase, wherein the third base
of each codon was changed to an A or a T without changing the corresponding amino acid; c) the addition of one or
more stop codons to enhance the termination of transposase synthesis; and/or, d) the addition of an effective polyA
sequence operably-linked to the transposase to further enhance expression of the transposase gene.
[0052] Although not wanting to be bound by the following statement, it is believed that the modifications of the first
several N-terminal codons of the transposase gene increase transcription of the transposase gene, in part, by increasing
strand dissociation. It is preferable that between approximately 1 and 20, more preferably 3 and 15, and most preferably
between 4 and 12 of the first N-terminal codons of the transposase are modified such that the third base of each codon
is changed to an A or a T without changing the encoded amino acid. In one embodiment the first ten N-terminal codons
of the transposase gene are modified in this manner. It is also preferred that the transposase contain mutations that
make it less specific for preferred insertion sites and thus increase the rate of transgene insertion as discussed in U.S.
Patent No. 5,719,055.
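By way of non-limiting illustration, the third-base modification described above can be sketched as a synonymous substitution over the standard genetic code. The function name `modify_n_terminal_codons`, the choice to try A before T, and the choice to count the initiating ATG as the first codon are assumptions made for this sketch; the disclosure does not specify them. Codons with no A- or T-ending synonym (for example ATG and TGG) are left unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def modify_n_terminal_codons(cds: str, n_codons: int = 10) -> str:
    """Change the third base of the first n codons to A or T when a synonymous codon exists."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    out = []
    for index, codon in enumerate(codons):
        if index < n_codons and codon[2] in "GC":
            for base in "AT":  # assumption: prefer A, then T
                candidate = codon[:2] + base
                if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                    codon = candidate
                    break
        out.append(codon)
    # Re-attach any trailing bases that did not form a complete codon.
    return "".join(out) + cds[usable:]
```

For example, `modify_n_terminal_codons("ATGGCCCTGTGGAAG")` returns `"ATGGCACTATGGAAA"`: GCC, CTG and AAG are replaced by GCA, CTA and AAA, while ATG and TGG are retained because no synonymous codon ending in A or T exists for methionine or tryptophan.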

[0053] In some embodiments, the transposon-based vectors are optimized for expression in a particular host by
changing the methylation patterns of the vector DNA. For example, prokaryotic methylation may be reduced by using a
methylation-deficient organism for production of the transposon-based vector. The transposon-based vectors may also
be methylated to resemble eukaryotic DNA for expression in a eukaryotic host.

[0054] Transposases and insertion sequences from other analogous eukaryotic transposon-based vectors that can
also be modified and used are, for example, the Drosophila P element derived vectors disclosed in U.S. Patent No.
6,291,243; the Drosophila mariner element described in Sherman et al. (1998); or the sleeping beauty transposon. See
also Hackett et al. (1999); D. Lampe et al., 1999. Proc. Natl. Acad. Sci. USA, 96:11428-11433; S. Fischer et al., 2001.
Proc. Natl. Acad. Sci. USA, 98:6759-6764; L. Zagoraiou et al., 2001. Proc. Natl. Acad. Sci. USA, 98:11474-11478; and
D. Berg et al. (Eds.), Mobile DNA, Amer. Soc. Microbiol. (Washington, D.C., 1989). However, it should be noted that
bacterial transposon-based elements are preferred, as there is less likelihood that a eukaryotic transposase in the
recipient species will recognize prokaryotic insertion sequences bracketing the transgene.

[0055] Many transposases recognize different insertion sequences, and therefore, it is to be understood that a
transposase-based vector will contain insertion sequences recognized by the particular transposase also found in the
transposase-based vector. In a preferred embodiment of the invention, the insertion sequences have been shortened to about
70 base pairs in length as compared to those found in wild-type transposons that typically contain insertion sequences
of well over 100 base pairs.

[0056] While the examples provided below incorporate a "cut and insert" Tn10 based vector that is destroyed following
the insertion event, the present invention also encompasses the use of a "rolling replication" type transposon-based
vector. Use of a rolling replication type transposon allows multiple copies of the transposon/transgene to be made from
a single transgene construct and the copies inserted. This type of transposon-based system thereby provides for insertion
of multiple copies of a transgene into a single genome. A rolling replication type transposon-based vector may be preferred
when the promoter operably-linked to the gene of interest is endogenous to the host cell and present in a high copy number
or highly expressed. However, use of a rolling replication system may require tight control to limit the insertion events
to non-lethal levels. Tn1, Tn2, Tn3, Tn4, Tn5, Tn9, Tn21, Tn501, Tn551, Tn951, Tn1721, Tn2410 and Tn2603 are
examples of a rolling replication type transposon, although Tn5 could be both a rolling replication and a cut and insert
type transposon.

Stop Codons and PolyA Sequences

[0057] In one embodiment, the transposon-based vector contains two stop codons operably-linked to the transposase
and/or to the gene of interest. In an alternate embodiment, one stop codon of UAA or UGA is operably linked to the
transposase and/or to the gene of interest.
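By way of non-limiting illustration, the two-stop-codon embodiment can be sketched as a check on the 3' end of a coding sequence. The function names `count_terminal_stops` and `ensure_two_stops`, the default choice of TAA (the DNA form of UAA) as the appended codon, and the assumption that the input is an in-frame coding sequence whose length is a multiple of three are all made for this sketch only.

```python
# DNA equivalents of the stop codons UAA, UGA and UAG.
STOP_CODONS = {"TAA", "TGA", "TAG"}


def count_terminal_stops(cds: str) -> int:
    """Count consecutive stop codons at the 3' end of an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA notation
    count = 0
    end = len(cds)
    while end >= 3 and cds[end - 3:end] in STOP_CODONS:
        count += 1
        end -= 3
    return count


def ensure_two_stops(cds: str, stop: str = "TAA") -> str:
    """Append stop codons until the sequence ends with at least two of them."""
    cds = cds.upper().replace("U", "T")
    missing = max(0, 2 - count_terminal_stops(cds))
    return cds + stop * missing
```

For example, `ensure_two_stops("ATGAAATAA")` returns `"ATGAAATAATAA"`, and a sequence that already ends in two stop codons is returned unchanged apart from normalization to upper-case DNA notation.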

[0058] As used herein an "effective polyA sequence" refers to either a synthetic or non-synthetic sequence that contains
multiple and sequential nucleotides containing an adenine base (an A polynucleotide string) and that increases expression
of the gene to which it is operably-linked. A polyA sequence may be operably-linked to any gene in the transposon-based
vector including, but not limited to, a transposase gene and a gene of interest. A preferred polyA sequence is
optimized for use in the host animal or human. In one embodiment, the polyA sequence is optimized for use in an avian
species and more specifically, a chicken. An avian optimized polyA sequence generally contains a minimum of 40 base
pairs, preferably between approximately 40 and several hundred base pairs, and more preferably approximately 75 base
pairs that precede the A polynucleotide string and thereby separate the stop codon from the A polynucleotide string. In
one embodiment of the present invention, the polyA sequence comprises a conalbumin polyA sequence as provided in
SEQ ID NO:11 and as taken from GenBank accession #Y00407, base pairs 10651-11058. In another embodiment, the
polyA sequence comprises a synthetic polynucleotide sequence shown in SEQ ID NO:12. In yet another embodiment,
the polyA sequence comprises an avian optimized polyA sequence provided in SEQ ID NO:13. A chicken optimized
polyA sequence may also have a reduced amount of CT repeats as compared to a synthetic polyA sequence.
[0059] It is a surprising discovery of the present invention that such an avian optimized polyA sequence increases
expression of a polynucleotide to which it is operably-linked in an avian as compared to a non-avian optimized polyA
sequence. Accordingly, the present invention includes methods of increasing incorporation of a gene of interest
wherein the gene of interest resides in a transposon-based vector containing a transposase gene and wherein the
transposase gene is operably linked to an avian optimized polyA sequence. The present invention also includes methods
of increasing expression of a gene of interest in an avian that include administering a gene of interest to the avian,
wherein the gene of interest is operably-linked to an avian optimized polyA sequence. An avian optimized polyA nucleotide
string is defined herein as a polynucleotide containing an A polynucleotide string and a minimum of 40 base pairs,
preferably between approximately 40 and several hundred base pairs, and more preferably approximately 60 base pairs
that precede the A polynucleotide string. The present invention further provides transposon-based vectors containing a
gene of interest or transposase gene operably linked to an avian optimized polyA sequence.
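By way of non-limiting illustration, the 40 base pair minimum that precedes the A polynucleotide string can be sketched as a simple length check. The function names `spacer_length` and `meets_avian_spacer_minimum` are hypothetical, and the threshold `min_a_run=10` used to recognize an A polynucleotide string is an assumption, since the disclosure does not state a minimum length for that string. The input is assumed to be the sequence that begins immediately after the stop codon.

```python
import re


def spacer_length(polya_region: str, min_a_run: int = 10):
    """Return the number of bases preceding the first A string, or None if none is found."""
    match = re.search("A{%d,}" % min_a_run, polya_region.upper())
    return None if match is None else match.start()


def meets_avian_spacer_minimum(polya_region: str, min_spacer: int = 40,
                               min_a_run: int = 10) -> bool:
    """Check that at least min_spacer bases separate the stop codon from the A string."""
    length = spacer_length(polya_region, min_a_run)
    return length is not None and length >= min_spacer
```

For example, a region made of 60 non-A bases followed by 20 adenines satisfies the check, whereas a region with only 20 bases ahead of the A string does not.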

Promoters and Enhancers

[0060] The first promoter operably-linked to the transposase gene and the second promoter operably-linked to the
gene of interest can be a constitutive promoter or an inducible promoter. Constitutive promoters include, but are not
limited to, immediate early cytomegalovirus (CMV) promoter, herpes simplex virus 1 (HSV1) immediate early promoter,
SV40 promoter, lysozyme promoter, early and late CMV promoters, early and late HSV promoters, β-actin promoter,
tubulin promoter, Rous-Sarcoma virus (RSV) promoter, and heat-shock protein (HSP) promoter. Inducible promoters
include tissue-specific promoters, developmentally-regulated promoters and chemically inducible promoters. Examples
of tissue-specific promoters include the glucose 6 phosphate (G6P) promoter, vitellogenin promoter, ovalbumin promoter,
ovomucoid promoter, conalbumin promoter, ovotransferrin promoter, prolactin promoter, kidney uromodulin promoter,
and placental lactogen promoter. In one embodiment, the vitellogenin promoter includes a polynucleotide sequence of
SEQ ID NO:14. The G6P promoter sequence may be deduced from a rat G6P gene untranslated upstream region
provided in GenBank accession number U57552.1. Examples of developmentally-regulated promoters include the
homeobox promoters and several hormone induced promoters. Examples of chemically inducible promoters include
reproductive hormone induced promoters and antibiotic inducible promoters such as the tetracycline inducible promoter and
the zinc-inducible metallothionein promoter.

[0061] Other inducible promoter systems include the Lac operator repressor system inducible by IPTG (isopropyl
beta-D-thiogalactoside) (Cronin, A. et al. 2001. Genes and Development, v. 15); ecdysone-based inducible systems (Hoppe,
U. C. et al. 2000. Mol. Ther. 1:159-164); estrogen-based inducible systems (Braselmann, S. et al. 1993. Proc. Natl.
Acad. Sci. 90:1657-1661); progesterone-based inducible systems using a chimeric regulator, GLVP, which is a hybrid
protein consisting of the GAL4 binding domain and the herpes simplex virus transcriptional activation domain, VP16,
and a truncated form of the human progesterone receptor that retains the ability to bind ligand and can be turned on by
RU486 (Wang, et al. 1994. Proc. Natl. Acad. Sci. 91:8180-8184); and CID-based inducible systems using chemical inducers
of dimerization (CIDs) to regulate gene expression, such as a system wherein rapamycin induces dimerization of the
cellular proteins FKBP12 and FRAP (Belshaw, P. J. et al. 1996. J. Chem. Biol. 3:731-738; Fan, L. et al. 1999. Hum.
Gene Ther. 10:2273-2285; Shariat, S.F. et al. 2001. Cancer Res. 61:2562-2571; Spencer, D.M. 1996. Curr. Biol.
6:839-847). Chemical substances that activate the chemically inducible promoters can be administered to the animal
containing the transgene of interest via any method known to those of skill in the art.

[0062] Other examples of cell- or tissue-specific and constitutive promoters include, but are not limited to, smooth-muscle
SM22 promoter, including chimeric SM22alpha/telokin promoters (Hoggatt A.M. et al., 2002. Circ Res. 91(12):1151-9);
ubiquitin C promoter (Biochim Biophys Acta, 2003. Jan. 3;1625(1):52-63); Hsf2 promoter; murine COMP (cartilage
oligomeric matrix protein) promoter; early B cell-specific mb-1 promoter (Sigvardsson M., et al., 2002. Mol. Cell Biol.
22(24):8539-51); prostate specific antigen (PSA) promoter (Yoshimura I. et al., 2002, J. Urol. 168(6):2659-64); exorh
promoter and pineal expression-promoting element (Asaoka Y., et al., 2002. Proc. Natl. Acad. Sci. 99(24):15456-61);
neural and liver ceramidase gene promoters (Okino N. et al., 2002. Biochem. Biophys. Res. Commun. 299(1):160-6);
PSP94 gene promoter/enhancer (Gabril M.Y. et al., 2002. Gene Ther. 9(23):1589-99); promoter of the human FAT/CD36
gene (Kuriki C., et al., 2002. Biol. Pharm. Bull. 25(11):1476-8); VL30 promoter (Staplin W.R. et al., 2002. Blood October
24, 2002); and IL-10 promoter (Brenner S., et al., 2002. J. Biol. Chem. December 18, 2002).

[0063] Examples of avian promoters include, but are not limited to, promoters controlling expression of egg white
proteins, such as ovalbumin, ovotransferrin (conalbumin), ovomucoid, lysozyme, ovomucin, g2 ovoglobulin, g3 ovoglobulin,
ovoflavoprotein, ovostatin (ovomacroglobin), cystatin, avidin, thiamine-binding protein, glutamyl aminopeptidase,
minor glycoprotein 1, minor glycoprotein 2; and promoters controlling expression of egg-yolk proteins, such as vitellogenin,
very low-density lipoproteins, low density lipoprotein, cobalamin-binding protein, riboflavin-binding protein, biotin-binding
protein (Awade, 1996. Z. Lebensm. Unters. Forsch. 202:1-14). An advantage of using the vitellogenin promoter is that
it is active during the egg-laying stage of an animal's life-cycle, which allows for the production of the protein of interest
to be temporally connected to the import of the protein of interest into the egg yolk when the protein of interest is equipped
with an appropriate targeting sequence. In some embodiments, the avian promoter is an oviduct-specific promoter. As
used herein, the term "oviduct-specific promoter" includes, but is not limited to, ovalbumin; ovotransferrin (conalbumin);
ovomucoid; 01, 02, 03, 04 or 05 avidin; ovomucin; g2 ovoglobulin; g3 ovoglobulin; ovoflavoprotein; and ovostatin
(ovomacroglobin) promoters.

[0064] When germline transformation occurs via intraovarian administration, liver-specific promoters may be
operably-linked to the gene of interest to achieve liver-specific expression of the transgene. Liver-specific promoters of the present
invention include, but are not limited to, the following promoters: vitellogenin promoter, G6P promoter,
cholesterol-7-alpha hydroxylase (CYP7A) promoter, phenylalanine hydroxylase (PAH) promoter, protein C gene promoter,
insulin-like growth factor I (IGF-1) promoter, bilirubin UDP-glucuronosyltransferase promoter, aldolase B promoter, furin
promoter, metallothionein promoter, albumin promoter, and insulin promoter.

[0065] Also included in the present invention are promoters that can be used to target expression of a protein of interest
into the milk of a milk-producing animal including, but not limited to, β lactoglobin promoter, whey acidic protein promoter,
lactalbumin promoter and casein promoter.

[0066] When germline transformation occurs via intraovarian administration, immune system-specific promoters may
be operably-linked to the gene of interest to achieve immune system-specific expression of the transgene. Accordingly,
promoters associated with cells of the immune system may also be used. Acute phase promoters such as interleukin
(IL)-1 and IL-2 may be employed. Promoters for heavy and light chain Ig may also be employed. The promoters of the
T cell receptor components CD4 and CD8, B cell promoters and the promoters of CR2 (complement receptor type 2)
may also be employed. Immune system promoters are preferably used when the desired protein is an antibody protein.
[0067] Also included in this invention are modified promoters/enhancers wherein elements of a single promoter are
duplicated, modified, or otherwise changed. In one embodiment, steroid hormone-binding domains of the ovalbumin
promoter are moved from about -6.5 kb to within approximately the first 1000 base pairs of the gene of interest. Modifying
an existing promoter with promoter/enhancer elements not found naturally in the promoter, as well as building an entirely
synthetic promoter, or drawing promoter/enhancer elements from various genes together on a non-natural backbone,
are all encompassed by the current invention.

[0068] Accordingly, it is to be understood that the promoters contained within the transposon-based vectors of the
present invention may be entire promoter sequences or fragments of promoter sequences. For example, in one
embodiment, the promoter operably linked to a gene of interest is an approximately 900 base pair fragment of a chicken
ovalbumin promoter (SEQ ID NO:15). The constitutive and inducible promoters contained within the transposon-based
vectors may also be modified by the addition of one or more modified Kozak sequences of ACCATG (SEQ ID NO:1).

[0069] As indicated above, the present invention includes transposon-based vectors containing one or more enhancers.
These enhancers may or may not be operably-linked to their native promoter and may be located at any distance from
their operably-linked promoter. A promoter operably-linked to an enhancer and a promoter modified to eliminate
repressive regulatory effects are referred to herein as an "enhanced promoter." The enhancers contained within the
transposon-based vectors are preferably enhancers found in birds, and more preferably, an ovalbumin enhancer, but are not limited
to these types of enhancers. In one embodiment, an approximately 675 base pair enhancer element of an ovalbumin
promoter is cloned upstream of an ovalbumin promoter with 300 base pairs of spacer DNA separating the enhancer and
promoter. In one embodiment, the enhancer used as a part of the present invention comprises base pairs 1-675 of a
chicken ovalbumin enhancer from GenBank accession #S82527.1. The polynucleotide sequence of this enhancer is
provided in SEQ ID NO:16.

[0070] Also included in some of the transposon-based vectors of the present invention are cap sites and fragments
of cap sites. In one embodiment, approximately 50 base pairs of a 5' untranslated region wherein the cap site resides
are added on the 3' end of an enhanced promoter or promoter. An exemplary 5' untranslated region is provided in SEQ
ID NO:17. A putative cap site residing in this 5' untranslated region preferably comprises the polynucleotide sequence
provided in SEQ ID NO:18.

[0071] In one embodiment of the present invention, the first promoter operably-linked to the transposase gene is a
constitutive promoter and the second promoter operably-linked to the gene of interest is a tissue-specific promoter. In
this embodiment, use of the first constitutive promoter allows for constitutive activation of the transposase gene
and incorporation of the gene of interest into virtually all cell types, including the germline of the recipient animal. Although
the gene of interest is incorporated into the germline generally, the gene of interest may only be expressed in a
tissue-specific manner. A transposon-based vector having a constitutive promoter operably-linked to the transposase gene
can be administered by any route; and in one embodiment, the vector is administered to an ovary, to an artery leading
to the ovary or to a lymphatic system or fluid proximal to the ovary.

[0072] It should be noted that cell- or tissue-specific expression as described herein does not require a complete
absence of expression in cells or tissues other than the preferred cell or tissue. Instead, "cell-specific" or "tissue-specific"
expression refers to a majority of the expression of a particular gene of interest in the preferred cell or tissue, respectively.
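By way of non-limiting illustration, the "majority of the expression" definition above can be sketched as a fraction test. The function name `is_tissue_specific`, the reading of "majority" as strictly more than 50% of total measured expression, and the expression values used in the usage note are assumptions made for this sketch only.

```python
def is_tissue_specific(expression_by_tissue: dict, preferred: str) -> bool:
    """Return True when the preferred tissue accounts for more than half of total expression."""
    total = sum(expression_by_tissue.values())
    if total <= 0:
        return False  # no measurable expression anywhere
    return expression_by_tissue.get(preferred, 0.0) / total > 0.5
```

For example, with hypothetical relative expression levels `{"oviduct": 80, "liver": 15, "muscle": 5}`, expression would be regarded as oviduct-specific under this reading even though expression in liver and muscle is not zero.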
[0073] When incorporation of the gene of interest into the germline is not preferred, the first promoter operably-linked
to the transposase gene can be a tissue-specific promoter. For example, transfection of a transposon-based vector
containing a transposase gene operably-linked to an oviduct specific promoter such as the ovalbumin promoter provides
for activation of the transposase gene and incorporation of the gene of interest in the cells of the oviduct but not into the
germline and other cells generally. In this embodiment, the second promoter operably-linked to the gene of interest can
be a constitutive promoter or an inducible promoter. In a preferred embodiment, both the first promoter and the second
promoter are an ovalbumin promoter. In embodiments wherein tissue-specific expression or incorporation is desired, it
is preferred that the transposon-based vector is administered directly to the tissue of interest, to an artery leading to the
tissue of interest or to fluids surrounding the tissue of interest. In a preferred embodiment, the tissue of interest is the
oviduct and administration is achieved by direct injection into the oviduct or an artery leading to the oviduct. In a further
preferred embodiment, administration is achieved by direct injection into the lumen of the magnum or the infundibulum
of the oviduct. Indirect administration to the oviduct may occur through the cloaca.

[0074] Accordingly, cell-specific promoters may be used to enhance transcription in selected tissues. In birds, for
example, promoters that are found in cells of the fallopian tube, such as ovalbumin, conalbumin, ovomucoid and/or
lysozyme, are used in the vectors to ensure transcription of the gene of interest in the epithelial cells and tubular gland
cells of the fallopian tube, leading to synthesis of the desired protein encoded by the gene and deposition into the egg
white. In mammals, promoters specific for the epithelial cells of the alveoli of the mammary gland, such as prolactin,
insulin, beta lactoglobin, whey acidic protein, lactalbumin, casein, and/or placental lactogen, are used in the design of
vectors used for transfection of these cells for the production of desired proteins for deposition into the milk. In liver cells,
the G6P promoter may be employed to drive transcription of the gene of interest for protein production. Proteins made
in the liver of birds may be delivered to the egg yolk.

[0075] In order to achieve higher or more efficient expression of the transposase gene, the promoter and other
regulatory sequences operably-linked to the transposase gene may be those derived from the host. These host specific
regulatory sequences can be tissue specific as described above or can be of a constitutive nature. For example, an
avian actin promoter and its associated polyA sequence can be operably-linked to a transposase in a
transposase-based vector for transfection into an avian. Examples of other host specific promoters that could be operably-linked to
the transposase include the myosin and DNA or RNA polymerase promoters.

Directing Sequences

[0076] In some embodiments of the present invention, the gene of interest is operably-linked to a directing sequence
or a sequence that provides proper conformation to the desired protein encoded by the gene of interest. As used herein,
the term "directing sequence" refers to both signal sequences and targeting sequences. An egg directing sequence
includes, but is not limited to, an ovomucoid signal sequence, an ovalbumin signal sequence, a cecropin pre pro signal
sequence, and a vitellogenin targeting sequence. The term "signal sequence" refers to an amino acid sequence, or the
polynucleotide sequence that encodes the amino acid sequence, that directs the protein to which it is linked to the
endoplasmic reticulum in a eukaryote, and more preferably the translocational pores in the endoplasmic reticulum, or
the plasma membrane in a prokaryote, or mitochondria, such as for the purpose of gene therapy for mitochondrial
diseases. Signal and targeting sequences can be used to direct a desired protein into, for example, the milk, when the
transposon-based vectors are administered to a milk-producing animal.

[0077] Signal sequences can also be used to direct a desired protein into, for example, a secretory pathway for
incorporation into the egg yolk or the egg white, when the transposon-based vectors are administered to a bird or other
egg-laying animal. One example of such a transposon-based vector is provided in Figure 3 wherein the gene of interest
is operably linked to the ovomucoid signal sequence. The present invention also includes a gene of interest
operably-linked to a second gene containing a signal sequence. An example of such an embodiment is shown in Figure 2 wherein
the gene of interest is operably-linked to the ovalbumin gene that contains an ovalbumin signal sequence. Other signal
sequences that can be included in the transposon-based vectors include, but are not limited to, the ovotransferrin and
lysozyme signal sequences. In one embodiment, the signal sequence is an ovalbumin signal sequence including a
sequence shown in SEQ ID NO:19. In another embodiment, the signal sequence is a modified ovalbumin signal sequence
including a sequence shown in SEQ ID NO:20 or SEQ ID NO:21.

[0078] As also used herein, the term "targeting sequence" refers to an amino acid sequence, or the polynucleotide
sequence encoding the amino acid sequence, which amino acid sequence is recognized by a receptor located on the
exterior of a cell. Binding of the receptor to the targeting sequence results in uptake of the protein or peptide
operably-linked to the targeting sequence by the cell. One example of a targeting sequence is a vitellogenin targeting sequence
that is recognized by a vitellogenin receptor (or the low density lipoprotein receptor) on the exterior of an oocyte. In one
embodiment, the vitellogenin targeting sequence includes the polynucleotide sequence of SEQ ID NO:22. In another
embodiment, the vitellogenin targeting sequence includes all or part of the vitellogenin gene. Other targeting sequences
include VLDL and ApoE, which are also capable of binding the vitellogenin receptor. Since the ApoE protein is not
endogenously expressed in birds, its presence may be used advantageously to identify birds carrying the
transposon-based vectors of the present invention.

Genes of Interest Encoding Desired Proteins

[0079] A gene of interest selected for stable incorporation is designed to encode any desired protein or peptide or to
regulate any cellular response. In some embodiments, the desired proteins or peptides are deposited in an egg. It is to
be understood that the present invention encompasses transposon-based vectors containing multiple genes of interest.
The multiple genes of interest may each be operably-linked to a separate promoter and other regulatory sequence(s)
or may all be operably-linked to the same promoter and other regulatory sequence(s). In one embodiment, multiple
genes of interest are linked to a single promoter and other regulatory sequence(s) and each gene of interest is separated
by a cleavage site or a pro portion of a signal sequence. A gene of interest may contain modifications of the codons for
the first several N-terminal amino acids of the gene of interest, wherein the third base of each codon is changed to an
A or a T without changing the corresponding amino acid.

[0080] Protein and peptide hormones are a preferred class of proteins in the present invention. Such protein and
peptide hormones are synthesized throughout the endocrine system and include, but are not limited to, hypothalamic
hormones and hypophysiotropic hormones, anterior, intermediate and posterior pituitary hormones, pancreatic islet
hormones, hormones made in the gastrointestinal system, renal hormones, thymic hormones, parathyroid hormones,
adrenal cortical and medullary hormones. Specifically, hormones that can be produced using the present invention
include, but are not limited to, chorionic gonadotropin, corticotropin, erythropoietin, glucagon, IGF-1, oxytocin,
platelet-derived growth factor, calcitonin, follicle-stimulating hormone, luteinizing hormone, thyroid-stimulating hormone, insulin,
gonadotropin-releasing hormone and its analogs, vasopressin, octreotide, somatostatin, prolactin, adrenocorticotropic
hormone, antidiuretic hormone, thyrotropin-releasing hormone (TRH), growth hormone-releasing hormone (GHRH),
dopamine, melatonin, thyroxin (T4), parathyroid hormone (PTH), glucocorticoids such as cortisol, mineralocorticoids
such as aldosterone, androgens such as testosterone, adrenaline (epinephrine), noradrenaline (norepinephrine),
estrogens such as estradiol, progesterone, calcitriol, calciferol, atrial-natriuretic peptide, gastrin, secretin,
cholecystokinin (CCK), neuropeptide Y, ghrelin, PYY3-36, angiotensinogen, thrombopoietin, and leptin. By using appropriate
polynucleotide sequences, species-specific hormones may be made by transgenic animals.

[0081] In one embodiment of the present invention, the gene of interest is a proinsulin gene and the desired molecule
is insulin. Proinsulin consists of three parts: a C-peptide and two strands of amino acids (the alpha and beta chains) that
later become linked together to form the insulin molecule. Figures 2 and 3 are schematics of transposon-based vector
constructs containing a proinsulin gene operably-linked to an ovalbumin promoter and ovalbumin protein or an ovomucoid
promoter and ovomucoid signal sequence, respectively. In these embodiments, proinsulin is expressed in the oviduct
tubular gland cells and then deposited in the egg white. One example of a proinsulin polynucleotide sequence is shown
in SEQ ID NO:23, wherein the C-peptide cleavage site spans from Arg at position 31 to Arg at position 65.
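By way of non-limiting illustration, the position arithmetic of the cleavage span (Arg at position 31 through Arg at position 65) can be sketched on an amino acid sequence. The 86-residue human proinsulin sequence below is supplied from general knowledge as an assumption for this sketch; it is not taken from SEQ ID NO:23, which is a polynucleotide sequence. The function name `split_proinsulin` is likewise hypothetical.

```python
# Human proinsulin (86 residues), assumed here for illustration only;
# NOT copied from SEQ ID NO:23.
PROINSULIN = (
    "FVNQHLCGSHLVEALYLVCGERGFFYTPKT"       # residues 1-30
    "RREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKR"  # residues 31-65 (excised span)
    "GIVEQCCTSICSLYQLENYCN"                # residues 66-86
)


def split_proinsulin(seq: str, first: int = 31, last: int = 65):
    """Split a proinsulin sequence at a 1-based, inclusive cleavage span."""
    n_terminal_chain = seq[:first - 1]  # residues before the span
    excised = seq[first - 1:last]       # span removed during processing
    c_terminal_chain = seq[last:]       # residues after the span
    return n_terminal_chain, excised, c_terminal_chain
```

With the assumed sequence, the excised span is 35 residues long and both begins and ends with arginine, leaving a 30-residue N-terminal chain and a 21-residue C-terminal chain that together form the insulin molecule.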
[0082] Serum proteins including lipoproteins such as high density lipoprotein (HDL), HDL-Milano and low density
lipoprotein, albumin, clotting cascade factors, factor VIII, factor IX, fibrinogen, and globulins are also included in the
group of desired proteins of the present invention. Immunoglobulins are one class of desired globulin molecules and
include but are not limited to IgG, IgM, IgA, IgD, IgE, IgY, lambda chains, kappa chains and fragments thereof; Fc
fragments, and Fab fragments. Desired antibodies include, but are not limited to, naturally occurring antibodies, human
antibodies, humanized antibodies, and hybrid antibodies. Genes encoding modified versions of naturally occurring
antibodies or fragments thereof and genes encoding artificially designed antibodies or fragments thereof may be
incorporated into the transposon-based vectors of the present invention. Desired antibodies also include antibodies with the
ability to bind specific ligands, for example, antibodies against proteins associated with cancer-related molecules, such
as anti-her 2, or anti-CA125. Accordingly, the present invention encompasses a transposon-based vector containing
one or more genes encoding a heavy immunoglobulin (Ig) chain and a light Ig chain. Further, more than one gene
encoding for more than one antibody may be administered in one or more transposon-based vectors of the present
invention. In this manner, an egg may contain more than one type of antibody in the egg white, the egg yolk or both. In
one embodiment, a transposon-based vector contains a heavy Ig chain and a light Ig chain, both operably linked to a
promoter.

[0083] Antibodies used as therapeutic reagents include, but are not limited to, antibodies for use in cancer
immunotherapy against specific antigens, or for providing passive immunity to an animal or a human against an infectious disease
or a toxic agent. Antibodies used as diagnostic reagents include, but are not limited to, antibodies that may be labeled
and detected with a detector, for example antibodies with a fluorescent label attached that may be detected following
exposure to specific wavelengths. Such labeled antibodies may be primary antibodies directed to a specific antigen, for
example, rhodamine-labeled rabbit anti-growth hormone, or may be labeled secondary antibodies, such as
fluorescein-labeled goat-anti chicken IgG. Such labeled antibodies are known to one of ordinary skill in the art. Labels useful for
attachment to antibodies are also known to one of ordinary skill in the art. Some of these labels are described in the
"Handbook of Fluorescent Probes and Research Products", ninth edition, Richard P. Haugland (ed.) (Molecular Probes,
Inc., Eugene, OR), which is incorporated herein in its entirety.

[0084] Antibodies produced using the present invention may be used as laboratory reagents for numerous
applications including radioimmunoassay, western blots, dot blots, ELISA, immunoaffinity columns and other procedures
requiring antibodies as known to one of ordinary skill in the art. Such antibodies include primary antibodies, secondary
antibodies and tertiary antibodies, which may be labeled or unlabeled.

[0085] Antibodies that may be made with the practice of the present invention include, but are not limited to, primary
antibodies, secondary antibodies, designer antibodies, anti-protein antibodies, anti-peptide antibodies, anti-DNA
antibodies, anti-RNA antibodies, anti-hormone antibodies, anti-hypophysiotropic peptides, antibodies against non-natural
antigens, anti-anterior pituitary hormone antibodies, anti-posterior pituitary hormone antibodies, anti-venom antibodies,
anti-tumor marker antibodies, antibodies directed against epitopes associated with infectious disease, including antiviral,
anti-bacterial, anti-protozoal, anti-fungal, anti-parasitic, anti-receptor, anti-lipid, anti-phospholipid, anti-growth factor,
anti-cytokine, anti-monokine, anti-idiotype, and anti-accessory (presentation) protein antibodies. Antibodies made with
the present invention, as well as light chains or heavy chains, may also be used to inhibit enzyme activity.

[0086] Antibodies that may be produced using the present invention include, but are not limited to, antibodies made
against the following proteins: Bovine γ-Globulin, Serum; Bovine IgG, Plasma; Chicken γ-Globulin, Serum; Human
γ-Globulin, Serum; Human IgA, Plasma; Human IgA1, Myeloma; Human IgA2, Myeloma; Human IgA2, Plasma; Human
IgD, Plasma; Human IgE, Myeloma; Human IgG, Plasma; Human IgG, Fab Fragment, Plasma; Human IgG, F(ab')2
Fragment, Plasma; Human IgG, Fc Fragment, Plasma; Human IgG1, Myeloma; Human IgG2, Myeloma; Human IgG3,
Myeloma; Human IgG4, Myeloma; Human IgM, Myeloma; Human IgM, Plasma; Human Immunoglobulin, Light Chain κ,
Urine; Human Immunoglobulin, Light Chains κ and λ, Plasma; Mouse γ-Globulin, Serum; Mouse IgG, Serum; Mouse
IgM, Myeloma; Rabbit γ-Globulin, Serum; Rabbit IgG, Plasma; and Rat γ-Globulin, Serum. In one embodiment, the
transposon-based vector comprises the coding sequence of light and heavy chains of a murine monoclonal antibody
that shows specificity for human seminoprotein (GenBank Accession numbers AY129006 and AY129304 for the light
and heavy chains, respectively).

[0087] A further non-limiting list of antibodies that recognize other antibodies is as follows: Anti-Chicken IgG, heavy
(H) & light (L) Chain Specific (Sheep); Anti-Goat γ-Globulin (Donkey); Anti-Goat IgG, Fc Fragment Specific (Rabbit);
Anti-Guinea Pig γ-Globulin (Goat); Anti-Human Ig, Light Chain, Type κ Specific; Anti-Human Ig, Light Chain, Type λ
Specific; Anti-Human IgA, α-Chain Specific (Goat); Anti-Human IgA, Fab Fragment Specific; Anti-Human IgA, Fc
Fragment Specific; Anti-Human IgA, Secretory; Anti-Human IgE, ε-Chain Specific (Goat); Anti-Human IgE, Fc Fragment
Specific; Anti-Human IgG, Fc Fragment Specific (Goat); Anti-Human IgG, γ-Chain Specific (Goat); Anti-Human IgG, Fc
Fragment Specific; Anti-Human IgG, Fd Fragment Specific; Anti-Human IgG, H & L Chain Specific (Goat); Anti-Human
IgG1, Fc Fragment Specific; Anti-Human IgG2, Fc Fragment Specific; Anti-Human IgG2, Fd Fragment Specific;
Anti-Human IgG3, Hinge Specific; Anti-Human IgG4, Fc Fragment Specific; Anti-Human IgM, Fc Fragment Specific;
Anti-Human IgM, μ-Chain Specific; Anti-Mouse IgE, ε-Chain Specific; Anti-Mouse γ-Globulin (Goat); Anti-Mouse IgG, γ-Chain
Specific (Goat); Anti-Mouse IgG, γ-Chain Specific (Goat) F(ab')2 Fragment; Anti-Mouse IgG, H & L Chain Specific (Goat);
Anti-Mouse IgM, μ-Chain Specific (Goat); Anti-Mouse IgM, H & L Chain Specific (Goat); Anti-Rabbit γ-Globulin (Goat);
Anti-Rabbit IgG, Fc Fragment Specific (Goat); Anti-Rabbit IgG, H & L Chain Specific (Goat); Anti-Rat γ-Globulin (Goat);
Anti-Rat IgG, H & L Chain Specific; Anti-Rhesus Monkey γ-Globulin (Goat); and Anti-Sheep IgG, H & L Chain Specific.
[0088] Another non-limiting list of the antibodies that may be produced using the present invention is provided in
product catalogs of companies such as Phoenix Pharmaceuticals, Inc. (www.phoenixpeptide.com; 530 Harbor Boulevard,
Belmont, CA), Peninsula Labs (San Carlos, CA), SIGMA (St. Louis, MO, www.sigma-aldrich.com), Cappel ICN (Irvine,
California, www.icnbiomed.com), and Calbiochem (La Jolla, California, www.calbiochem.com), which are all incorporated
herein by reference in their entirety. The polynucleotide sequences encoding these antibodies may be obtained from
the scientific literature, from patents, and from databases such as GenBank. Alternatively, one of ordinary skill in the art
may design the polynucleotide sequence to be incorporated into the genome by choosing the codons that encode
each amino acid in the desired antibody. Antibodies made by the transgenic animals of the present invention include
antibodies that may be used as therapeutic reagents, for example in cancer immunotherapy against specific antigens,
as diagnostic reagents and as laboratory reagents for numerous applications including immunoneutralization,
radioimmunoassay, western blots, dot blots, ELISA, immunoprecipitation and immunoaffinity columns. Some of these antibodies
include, but are not limited to, antibodies which bind the following ligands: adrenomedullin, amylin, calcitonin, amyloid,
calcitonin gene-related peptide, cholecystokinin, gastrin, gastric inhibitory peptide, gastrin releasing peptide, interleukin,
interferon, cortistatin, somatostatin, endothelin, sarafotoxin, glucagon, glucagon-like peptide, insulin, atrial natriuretic
peptide, BNP, CNP, neurokinin, substance P, leptin, neuropeptide Y, melanin concentrating hormone, melanocyte
stimulating hormone, orphanin, endorphin, dynorphin, enkephalin, leumorphin, peptide F, PACAP, PACAP-related
peptide, parathyroid hormone, urocortin, corticotrophin releasing hormone, PHM, PHI, vasoactive intestinal
polypeptide, secretin, ACTH, angiotensin, angiostatin, bombesin, endostatin, bradykinin, FMRF amide, galanin,
gonadotropin releasing hormone (GnRH) associated peptide, GnRH, growth hormone releasing hormone, inhibin,
granulocyte-macrophage colony stimulating factor (GM-CSF), motilin, neurotensin, oxytocin, vasopressin, osteocalcin, pancreastatin,
pancreatic polypeptide, peptide YY, proopiomelanocortin, transforming growth factor, vascular endothelial growth factor,
vesicular monoamine transporter, vesicular acetylcholine transporter, ghrelin, NPW, NPB, C3d, prokinetican, thyroid
stimulating hormone, luteinizing hormones, follicle stimulating hormone, prolactin, growth hormone, beta-lipotropin,
melatonin, kallikreins, kinins, prostaglandins, erythropoietin, p146 (SEQ ID NO:24, amino acid sequence; SEQ ID NO:25,
nucleotide sequence), estrogen, testosterone, corticosteroids, mineralocorticoids, thyroid hormone, thymic hormones,
connective tissue proteins, nuclear proteins, actin, avidin, activin, agrin, albumin, and prohormones, propeptides, splice
variants, fragments and analogs thereof.

[0089] The following is yet another non-limiting list of antibodies that can be produced by the methods of the present
invention: abciximab (ReoPro), abciximab antiplatelet aggregation monoclonal antibody, anti-CD11a (hu1124),
anti-CD18 antibody, anti-CD20 antibody, anti-cytomegalovirus (CMV) antibody, anti-digoxin antibody, anti-hepatitis B
antibody, anti-HER-2 antibody, anti-idiotype antibody to GD3 glycolipid, anti-IgE antibody, anti-IL-2R antibody, antimetastatic
cancer antibody (mAb 17-1A), anti-rabies antibody, anti-respiratory syncytial virus (RSV) antibody, anti-Rh antibody,
anti-TCR, anti-TNF antibody, anti-VEGF antibody and Fab fragment thereof, rattlesnake venom antibody, black widow
spider venom antibody, coral snake venom antibody, antibody against very late antigen-4 (VLA-4), C225 humanized
antibody to EGF receptor, chimeric (human & mouse) antibody against TNFα, antibody directed against GPIIb/IIIa
receptor on human platelets, gamma globulin, anti-hepatitis B immunoglobulin, human anti-D immunoglobulin, human
antibodies against S. aureus, human tetanus immunoglobulin, humanized antibody against the epidermal growth
receptor-2, humanized antibody against the α subunit of the interleukin-2 receptor, humanized antibody CTLA4IG, humanized
antibody to the IL-2R α-chain, humanized anti-CD40-ligand monoclonal antibody (5c8), humanized mAb against the
epidermal growth receptor-2, humanized mAb to rous sarcoma virus, humanized recombinant antibody (IgG1κ) against
respiratory syncytial virus (RSV), lymphocyte immunoglobulin (anti-thymocyte antibody), lymphocyte immunoglobulin,
mAb against factor VII, MDX-210 bi-specific antibody against HER-2, MDX-22, MDX-220 bi-specific antibody against
TAG-72 on tumors, MDX-33 antibody to FcγRI receptor, MDX-447 bi-specific antibody against EGF receptor, MDX-447
bispecific humanized antibody to EGF receptor, MDX-RA immunotoxin (ricin A linked) antibody, Medi-507 antibody
(humanized form of BTI-322) against CD2 receptor on T-cells, monoclonal antibody LDP-02, muromonab-CD3 (OKT3)
antibody, OKT3 ("muromonab-CD3") antibody, PRO 542 antibody, ReoPro ("abciximab") antibody, and TNF-IgG fusion
protein.

[0090] The antibodies prepared using the methods of the present invention may also be designed to possess specific
labels that may be detected through means known to one of ordinary skill in the art. The antibodies may also be designed
to possess specific sequences useful for purification through means known to one of ordinary skill in the art. Specialty
antibodies designed for binding specific antigens may also be made in transgenic animals using the transposon-based
vectors of the present invention.

[0091] Production of a monoclonal antibody using the transposon-based vectors of the present invention can be
accomplished in a variety of ways. In one embodiment, two vectors may be constructed: one that encodes the light
chain, and a second vector that encodes the heavy chain of the monoclonal antibody. These vectors may then be
incorporated into the genome of the target animal by methods disclosed herein. In an alternative embodiment, the
sequences encoding light and heavy chains of a monoclonal antibody may be included on a single DNA construct. For
example, the coding sequence of light and heavy chains of a murine monoclonal antibody that shows specificity for human
seminoprotein can be expressed using transposon-based constructs of the present invention (GenBank Accession
numbers AY129006 and AY129304 for the light and heavy chains, respectively).

[0092] Further included in the present invention are proteins and peptides synthesized by the immune system, including
those synthesized by the thymus, lymph nodes, spleen, and the gastrointestinal associated lymph tissues (GALT) system.
The immune system proteins and peptides that can be made in transgenic animals using the transposon-based
vectors of the present invention include, but are not limited to, alpha-interferon, beta-interferon, gamma-interferon,
alpha-interferon A, alpha-interferon 1, G-CSF, GM-CSF, interleukin-1 (IL-1), IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10,
IL-11, IL-12, IL-13, TNF-α, and TNF-β. Other cytokines included in the present invention include cardiotrophin, stromal cell
derived factor, macrophage derived chemokine (MDC), melanoma growth stimulatory activity (MGSA), macrophage
inflammatory proteins 1 alpha (MIP-1 alpha), 2, 3 alpha, 3 beta, 4 and 5.

[0093] Lytic peptides such as p146 are also included in the desired molecules of the present invention. In one
embodiment, the p146 peptide comprises an amino acid sequence of SEQ ID NO:24. The present invention also
encompasses a transposon-based vector comprising a p146 nucleic acid comprising a polynucleotide sequence of SEQ ID
NO:25.

[0094] Enzymes are another class of proteins that may be made through the use of the transposon-based vectors of
the present invention. Such enzymes include but are not limited to adenosine deaminase, alpha-galactosidase, cellulase,
collagenase, DNase I, hyaluronidase, lactase, L-asparaginase, pancreatin, papain, streptokinase B, subtilisin, superoxide
dismutase, thrombin, trypsin, urokinase, fibrinolysin, glucocerebrosidase and plasminogen activator. In some
embodiments wherein the enzyme could have deleterious effects, additional amino acids and a protease cleavage site are
added to the carboxy end of the enzyme of interest in order to prevent expression of a functional enzyme. Subsequent
digestion of the enzyme with a protease results in activation of the enzyme.

[0095] Extracellular matrix proteins are one class of desired proteins that may be made through the use of the present
invention. Examples include but are not limited to collagen, fibrin, elastin, laminin, and fibronectin and subtypes thereof.
Intracellular proteins and structural proteins are other classes of desired proteins in the present invention.

[0096] Growth factors are another desired class of proteins that may be made through the use of the present invention
and include, but are not limited to, transforming growth factor-α (TGF-α), transforming growth factor-β (TGF-β),
platelet-derived growth factors (PDGF), fibroblast growth factors (FGF), including FGF acidic isoforms 1 and 2, FGF basic form
2 and FGF 4, 8, 9 and 10, nerve growth factors (NGF) including NGF 2.5s, NGF 7.0s and beta NGF and neurotrophins,
brain derived neurotrophic factor, cartilage derived factor, growth factors for stimulation of the production of red blood
cells, growth factors for stimulation of the production of white blood cells, bone growth factors (BGF), basic fibroblast
growth factor, vascular endothelial growth factor (VEGF), granulocyte colony stimulating factor (G-CSF), insulin like
growth factor (IGF) I and II, hepatocyte growth factor, glial neurotrophic growth factor (GDNF), stem cell factor (SCF),
keratinocyte growth factor (KGF), transforming growth factors (TGF), including TGFs alpha, beta, beta1, beta2, beta3,
skeletal growth factor, bone matrix derived growth factors, bone derived growth factors, erythropoietin (EPO) and mixtures
thereof.

[0097] Another desired class of proteins that may be made through the use of the present invention
includes, but is not limited to, leptin, leukemia inhibitory factor (LIF), tumor necrosis factor alpha and beta, ENBREL,
angiostatin, endostatin, thrombospondin, osteogenic protein-1, bone morphogenetic proteins 2 and 7, osteonectin,
somatomedin-like peptide, and osteocalcin.

[0098] Yet another desired class of proteins is blood proteins or clotting cascade proteins, including albumin,
Prekallikrein, High molecular weight kininogen (HMWK) (contact activation cofactor, Fitzgerald, Flaujeac Williams factor),
Factor I (Fibrinogen), Factor II (prothrombin), Factor III (Tissue Factor), Factor IV (calcium), Factor V (proaccelerin, labile
factor, accelerator (Ac-) globulin), Factor VI (Va) (accelerin), Factor VII (proconvertin, serum prothrombin conversion
accelerator (SPCA), cothromboplastin), Factor VIII (antihemophiliac factor A, antihemophilic globulin (AHG)), Factor IX
(Christmas Factor, antihemophilic factor B, plasma thromboplastin component (PTC)), Factor X (Stuart-Prower Factor),
Factor XI (Plasma thromboplastin antecedent (PTA)), Factor XII (Hageman Factor), Factor XIII (protransglutaminase,
fibrin stabilizing factor (FSF), fibrinoligase), von Willebrand factor, Protein C, Protein S, Thrombomodulin, Antithrombin III.

[0099] A non-limiting list of the peptides and proteins that may be made through the use of the present
invention is provided in product catalogs of companies such as Phoenix Pharmaceuticals, Inc. (www.phoenixpeptide.com;
530 Harbor Boulevard, Belmont, CA), Peninsula Labs (San Carlos, CA), SIGMA (St. Louis, MO, www.sigma-aldrich.com),
Cappel ICN (Irvine, California, www.icnbiomed.com), and Calbiochem (La Jolla, California, www.calbiochem.com). The
polynucleotide sequences encoding these proteins and peptides of interest may be obtained from the scientific literature,
from patents, and from databases such as GenBank. Alternatively, one of ordinary skill in the art may design the
polynucleotide sequence to be incorporated into the genome by choosing the codons that encode each amino acid in
the desired protein or peptide.
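
The codon-selection approach described above can be illustrated with a minimal sketch. It assumes the standard genetic code and picks a single codon per amino acid; the table `PREFERRED_CODON`, the stop codon choice, and the function name `back_translate` are hypothetical and are not taken from this specification. In practice, codon choice would likely be tuned to the host species.

```python
# Illustrative back-translation: amino acid sequence -> one possible coding DNA.
# Assumption: standard genetic code; one arbitrarily chosen codon per residue.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}
STOP_CODON = "TGA"  # one of the three standard stop codons


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Return a DNA sequence encoding `protein` (one-letter amino acid codes)."""
    codons = []
    for position, residue in enumerate(protein.upper(), start=1):
        if residue not in PREFERRED_CODON:
            raise ValueError(f"unknown amino acid {residue!r} at position {position}")
        codons.append(PREFERRED_CODON[residue])
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


if __name__ == "__main__":
    # A made-up three-residue peptide (Met-Lys-Trp), used only as a demonstration.
    print(back_translate("MKW"))
```

Because most amino acids are encoded by several codons, many different polynucleotide sequences encode the same protein; this sketch returns only one of them.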

[0100] Some of these desired proteins or peptides that may be made through the use of the present invention include
but are not limited to the following: adrenomedullin, amylin, calcitonin, amyloid, calcitonin gene-related peptide,
cholecystokinin, gastrin, gastric inhibitory peptide, gastrin releasing peptide, interleukin, interferon, cortistatin, somatostatin,
endothelin, sarafotoxin, glucagon, glucagon-like peptide, insulin, atrial natriuretic peptide, BNP, CNP, neurokinin,
substance P, leptin, neuropeptide Y, melanin concentrating hormone, melanocyte stimulating hormone, orphanin, endorphin,
dynorphin, enkephalin, leumorphin, peptide F, PACAP, PACAP-related peptide, parathyroid hormone, urocortin,
corticotrophin releasing hormone, PHM, PHI, vasoactive intestinal polypeptide, secretin, ACTH, angiotensin, angiostatin,
bombesin, endostatin, bradykinin, FMRF amide, galanin, gonadotropin releasing hormone (GnRH) associated peptide,
GnRH, growth hormone releasing hormone, inhibin, granulocyte-macrophage colony stimulating factor (GM-CSF),
motilin, neurotensin, oxytocin, vasopressin, osteocalcin, pancreastatin, pancreatic polypeptide, peptide YY,
proopiomelanocortin, transforming growth factor, vascular endothelial growth factor, vesicular monoamine transporter, vesicular
acetylcholine transporter, ghrelin, NPW, NPB, C3d, prokinetican, thyroid stimulating hormones, luteinizing hormone, follicle
stimulating hormone, prolactin, growth hormone, beta-lipotropin, melatonin, kallikreins, kinins, prostaglandins,
erythropoietin, p146 (SEQ ID NO:24, amino acid sequence; SEQ ID NO:25, nucleotide sequence), thymic hormones, connective
tissue proteins, nuclear proteins, actin, avidin, activin, agrin, albumin, apolipoproteins, apolipoprotein A, apolipoprotein
B, and prohormones, propeptides, splice variants, fragments and analogs thereof.

[0101] Other desired proteins that may be made by the method of the present invention include bacitracin, polymyxin
B, vancomycin, cyclosporine, anti-RSV antibody, alpha-1 antitrypsin (AAT), anti-cytomegalovirus antibody, anti-hepatitis
antibody, anti-inhibitor coagulant complex, anti-rabies antibody, anti-Rh(D) antibody, adenosine deaminase, anti-digoxin
antibody, antivenin crotalidae (rattlesnake venom antibody), antivenin latrodectus (black widow spider venom antibody),
antivenin micrurus (coral snake venom antibody), aprotinin, corticotropin (ACTH), diphtheria antitoxin, lymphocyte
immune globulin (anti-thymocyte antibody), protamine, thyrotropin, capreomycin, α-galactosidase, gramicidin,
streptokinase, tetanus toxoid, tyrothricin, IGF-1, proteins of varicella vaccine, anti-TNF antibody, anti-IL-2r antibody,
anti-HER-2 antibody, OKT3 ("muromonab-CD3") antibody, TNF-IgG fusion protein, ReoPro ("abciximab") antibody, ACTH fragment
1-24, desmopressin, gonadotropin-releasing hormone, histrelin, leuprolide, lypressin, nafarelin, peptide that binds
GPIIb/GPIIIa on platelets (integrilin), goserelin, capreomycin, colistin, anti-respiratory syncytial virus, lymphocyte immune
globulin (Thymoglobulin, Atgam), panorex, alpha-antitrypsin, botulinin, lung surfactant protein, tumor necrosis
receptor-IgG fusion protein (enbrel), gonadorelin, proteins of influenza vaccine, proteins of rotavirus vaccine, proteins of
haemophilus b conjugate vaccine, proteins of poliovirus vaccine, proteins of pneumococcal conjugate vaccine, proteins of
meningococcal C vaccine, proteins of influenza vaccine, megakaryocyte growth and development factor (MGDF),
neuroimmunophilin ligand-A (NIL-A), brain-derived neurotrophic factor (BDNF), glial cell line-derived neurotrophic factor
(GDNF), leptin (native), leptin B, leptin C, IL-1RA (interleukin-1RA), R-568, novel erythropoiesis-stimulating protein
(NESP), humanized mAb to rous sarcoma virus (MEDI-493), glutamyl-tryptophan dipeptide IM862, LFA-3TIP
immunosuppressive, humanized anti-CD40-ligand monoclonal antibody (5c8), gelsonin enzyme, tissue factor pathway inhibitor
(TFPI), proteins of meningitis B vaccine, antimetastatic cancer antibody (mAb 17-1A), chimeric (human & mouse) mAb
against TNFα, mAb against factor VII, relaxin, capreomycin, glycopeptide (LY333328), recombinant human activated
protein C (rhAPC), humanized mAb against the epidermal growth receptor-2, alteplase, anti-CD20 antigen, C2B8
antibody, insulin-like growth factor-1, atrial natriuretic peptide (anaritide), tenecteplase, anti-CD11a antibody (hu1124),
anti-CD18 antibody, mAb LDP-02, anti-VEGF antibody, Fab fragment of anti-VEGF Ab, APO2 ligand (tumor necrosis
factor-related apoptosis-inducing ligand), rTGF-β (transforming growth factor-β), alpha-antitrypsin, ananain (a pineapple
enzyme), humanized mAb CTLA4IG, PRO 542 (mAb), D2E7 (mAb), calf intestine alkaline phosphatase, α-L-iduronidase,
α-L-galactosidase, human glutamic acid decarboxylase, acid sphingomyelinase, bone morphogenetic protein-2
(rhBMP-2), proteins of HIV vaccine, T cell receptor (TCR) peptide vaccine, TCR peptides, V beta 3 and V beta 13.1 (IR502),
(IR501), 81 1050/1272 mAb against very late antigen-4 (VLA-4), C225 humanized mAb to EGF receptor, anti-idiotype
antibody to GD3 glycolipid, antibacterial peptide against H. pylori, MDX-447 bispecific humanized mAb to EGF receptor,
anti-cytomegalovirus (CMV), Medi-491 B19 parvovirus vaccine, humanized recombinant mAb (IgG1κ) against respiratory
syncytial virus (RSV), urinary tract infection vaccine (against "pili" on Escherichia coli strains), proteins of lyme disease
vaccine against B. burgdorferi protein (DbpA), proteins of Medi-501 human papilloma virus-11 vaccine (HPV),
Streptococcus pneumoniae vaccine, Medi-507 mAb (humanized form of BTI-322) against CD2 receptor on T-cells, MDX-33
mAb to FcγRI receptor, MDX-RA immunotoxin (ricin A linked) mAb, MDX-210 bi-specific mAb against HER-2,
MDX-447 bi-specific mAb against EGF receptor, MDX-22, MDX-220 bi-specific mAb against TAG-72 on tumors,
colony-stimulating factor (CSF) (molgramostim), humanized mAb to the IL-2R α-chain (basiliximab), mAb to IgE (IGE 025A),
myelin basic protein-altered peptide (MSP771A), humanized mAb against the epidermal growth receptor-2, humanized
mAb against the α subunit of the interleukin-2 receptor, low molecular weight heparin, anti-hemophilic factor, and
bactericidal/permeability-increasing protein (r-BPI).

[0102] The peptides and proteins made using the present invention may be labeled using labels and techniques known
to one of ordinary skill in the art. Some of these labels are described in the "Handbook of Fluorescent Probes and
Research Products", ninth edition, Richard P. Haugland (ed.) (Molecular Probes, Inc., Eugene, OR), which is incorporated
herein in its entirety. Some of these labels may be genetically engineered into the polynucleotide sequence for the
expression of the selected protein or peptide. The peptides and proteins may also have label-incorporation "handles"
incorporated to allow labeling of a protein that is otherwise difficult or impossible to label.

[0103] It is to be understood that the various classes of desired peptides and proteins, as well as specific peptides
and proteins described in this section, may be modified as described below by inserting selected codons for desired
amino acid substitutions into the gene incorporated into the transgenic animal.

[0104] Also, reference is made to the production of molecules other than proteins and peptides including, but not
limited to, lipoproteins such as high density lipoprotein (HDL), HDL-Milano, and low density lipoprotein, lipids,
carbohydrates, siRNA and ribozymes. In such cases, a gene of interest encodes a nucleic acid molecule or a protein that directs
production of the desired molecule.

[0105] Further, reference is made to the use of inhibitory molecules to inhibit endogenous (i.e., non-vector) protein
production. These inhibitory molecules include antisense nucleic acids, siRNA and inhibitory proteins.
The endogenous protein whose expression is inhibited may be an egg white protein including, but not limited to, ovalbumin,
ovotransferrin, and ovomucin.

[0106] A transposon-based vector containing an ovalbumin DNA sequence that, upon transcription, forms a
double-stranded RNA molecule may be transfected into an animal such as a bird, and the bird's production of endogenous
ovalbumin protein is reduced by the RNA interference (RNAi) mechanism.
A transposon-based vector may encode an inhibitory RNA molecule that inhibits the expression of more than one egg
white protein. One exemplary construct is provided in Figure 4 wherein "Ovgen" indicates approximately 60 base pairs
of an ovalbumin gene, "Ovotrans" indicates approximately 60 base pairs of an ovotransferrin gene and "Ovomucin"
indicates approximately 60 base pairs of an ovomucin gene. These ovalbumin, ovotransferrin and ovomucin sequences can be
from any avian species, and in some cases, are from a chicken or quail. The term "pro" indicates the pro portion of a
prepro sequence. One exemplary prepro sequence is that of cecropin, comprising base pairs 563-733 of the Cecropin
cap site and Prepro provided in GenBank accession number X07404. Additional cecropin prepro and pro sequences
are provided in SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50, and SEQ ID NO:51. Additionally, inducible knockouts
or knockdowns of the endogenous protein may be created to achieve a reduction or inhibition of endogenous protein
production. Endogenous egg white production can be inhibited in an avian at any time, but is preferably inhibited preceding,
or immediately preceding, the harvest of eggs.

Modified Desired Proteins and Peptides 

[0107] "Proteins", "peptides," "polypeptides" and "oligopeptides" are chains of amino acids (typically L-amino acids)
whose alpha carbons are linked through peptide bonds formed by a condensation reaction between the carboxyl group
of the alpha carbon of one amino acid and the amino group of the alpha carbon of another amino acid. The terminal
amino acid at one end of the chain (i.e., the amino terminal) has a free amino group, while the terminal amino acid at
the other end of the chain (i.e., the carboxy terminal) has a free carboxyl group. As such, the term "amino terminus"
(abbreviated N-terminus) refers to the free alpha-amino group on the amino acid at the amino terminal of the protein,
or to the alpha-amino group (imino group when participating in a peptide bond) of an amino acid at any other location
within the protein. Similarly, the term "carboxy terminus" (abbreviated C-terminus) refers to the free carboxyl group on
the amino acid at the carboxy terminus of a protein, or to the carboxyl group of an amino acid at any other location within
the protein.

[0108] Typically, the amino acids making up a protein are numbered in order, starting at the amino terminal and
increasing in the direction toward the carboxy terminal of the protein. Thus, when one amino acid is said to "follow"
another, that amino acid is positioned closer to the carboxy terminal of the protein than the preceding amino acid.

[0109] The term "residue" is used herein to refer to an amino acid (D or L) or an amino acid mimetic that is incorporated
into a protein by an amide bond. As such, the amino acid may be a naturally occurring amino acid or, unless otherwise
limited, may encompass known analogs of natural amino acids that function in a manner similar to the naturally occurring
amino acids (i.e., amino acid mimetics). Moreover, an amide bond mimetic includes peptide backbone modifications
well known to those skilled in the art.

[0110] Furthermore, one of skill will recognize that, as mentioned above, individual substitutions, deletions or additions
which alter, add or delete a single amino acid or a small percentage of amino acids (typically less than about 5%, more
typically less than about 1%) in an encoded sequence are conservatively modified variations where the alterations result
in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing
functionally similar amino acids are well known in the art. The following six groups each contain amino acids that are
conservative substitutions for one another:



1) Alanine (A), Serine (S), Threonine (T); 

2) Aspartic acid (D), Glutamic acid (E);

3) Asparagine (N), Glutamine (Q); 

4) Arginine (R), Lysine (K); 

5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and 

6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). 
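
As an illustration of how the six groups listed above might be applied, the following minimal sketch checks whether a substitution stays within one group and reports what fraction of a sequence has been altered. It uses the standard one-letter codes (E for glutamic acid); the names `CONSERVATIVE_GROUPS`, `is_conservative` and `summarize_substitutions` are hypothetical, and the sketch handles substitutions only, not deletions or additions.

```python
from typing import List, Set, Tuple

# The six conservative-substitution groups, in standard one-letter codes.
CONSERVATIVE_GROUPS: List[Set[str]] = [
    set("AST"),   # 1) Alanine, Serine, Threonine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(original: str, substitute: str) -> bool:
    """True if the two different residues belong to the same group."""
    original, substitute = original.upper(), substitute.upper()
    if original == substitute:
        return False  # identical residues are not a substitution
    return any(original in g and substitute in g for g in CONSERVATIVE_GROUPS)


def summarize_substitutions(reference: str, variant: str) -> Tuple[int, float, bool]:
    """Return (number changed, fraction changed, all changes conservative)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length (substitutions only)")
    if not reference:
        raise ValueError("sequences must be non-empty")
    changes = [(r, v) for r, v in zip(reference.upper(), variant.upper()) if r != v]
    fraction = len(changes) / len(reference)
    # With no changes at all, all() over an empty list is True.
    all_conservative = all(is_conservative(r, v) for r, v in changes)
    return len(changes), fraction, all_conservative


if __name__ == "__main__":
    # Made-up 10-residue sequences differing by one Lys -> Arg substitution.
    print(summarize_substitutions("MKTAYIAKQR", "MRTAYIAKQR"))
```

The returned fraction can be compared against the "less than about 5%" or "less than about 1%" thresholds mentioned above.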



[0111] A conservative substitution is a substitution in which the substituting amino acid (naturally occurring or modified)
is structurally related to the amino acid being substituted, i.e., has about the same size and electronic properties as the
amino acid being substituted. Thus, the substituting amino acid would have the same or a similar functional group in the
side chain as the original amino acid. A "conservative substitution" also refers to utilizing a substituting amino acid which
is identical to the amino acid being substituted except that a functional group in the side chain is protected with a suitable
protecting group.

[0112] Suitable protecting groups are described in Greene and Wuts, "Protecting Groups in Organic Synthesis", John
Wiley and Sons, Chapters 5 and 7, 1991, the teachings of which are incorporated herein by reference. Preferred protecting
groups are those which facilitate transport of the peptide through membranes, for example, by reducing the hydrophilicity
and increasing the lipophilicity of the peptide, and which can be cleaved, either by hydrolysis or enzymatically (Ditter et
al., 1968. J. Pharm. Sci. 57:783; Ditter et al., 1968. J. Pharm. Sci. 57:828; Ditter et al., 1969. J. Pharm. Sci. 58:557; King
et al., 1987. Biochemistry 26:2294; Lindberg et al., 1989. Drug Metabolism and Disposition 17:311; Tunek et al., 1988.
Biochem. Pharm. 37:3867; Anderson et al., 1985. Arch. Biochem. Biophys. 239:538; and Singhal et al., 1987. FASEB
J. 1:220). Suitable hydroxyl protecting groups include ester, carbonate and carbamate protecting groups. Suitable amine
protecting groups include acyl groups and alkoxy or aryloxy carbonyl groups, as described above for N-terminal protecting
groups. Suitable carboxylic acid protecting groups include aliphatic, benzyl and aryl esters, as described below for
C-terminal protecting groups. In one embodiment, the carboxylic acid group in the side chain of one or more glutamic acid
or aspartic acid residues in a peptide of the present invention is protected, preferably as a methyl, ethyl, benzyl or
substituted benzyl ester, more preferably as a benzyl ester.

[0113] Provided below are groups of naturally occurring and modified amino acids in which each amino acid in a group 
has similar electronic and steric properties. Thus, a conservative substitution can be made by substituting an amino acid 
with another amino acid from the same group. It is to be understood that these groups are non-limiting, i.e. that there 
are additional modified amino acids which could be included in each group. 



Group I includes leucine, isoleucine, valine, methionine and modified amino acids having the following side chains:
ethyl, n-propyl, n-butyl. Preferably, Group I includes leucine, isoleucine, valine and methionine.

Group II includes glycine, alanine, valine and a modified amino acid having an ethyl side chain. Preferably, Group 
II includes glycine and alanine. 

Group III includes phenylalanine, phenylglycine, tyrosine, tryptophan, cyclohexylmethyl glycine, and modified amino
acid residues having substituted benzyl or phenyl side chains. Preferred substituents include one or more of the
following: halogen, methyl, ethyl, nitro, -NH2, methoxy, ethoxy and -CN. Preferably, Group III includes
phenylalanine, tyrosine and tryptophan.

Group IV includes glutamic acid, aspartic acid, a substituted or unsubstituted aliphatic, aromatic or benzylic ester of
glutamic or aspartic acid (e.g., methyl, ethyl, n-propyl, iso-propyl, cyclohexyl, benzyl or substituted benzyl),
glutamine, asparagine, -CO-NH- alkylated glutamine or asparagine (e.g., methyl, ethyl, n-propyl and
iso-propyl) and modified amino acids having the side chain -(CH2)3-COOH, an ester thereof (substituted or
unsubstituted aliphatic, aromatic or benzylic ester), an amide thereof and a substituted or unsubstituted
N-alkylated amide thereof. Preferably, Group IV includes glutamic acid, aspartic acid, methyl aspartate, ethyl
aspartate, benzyl aspartate and methyl glutamate, ethyl glutamate and benzyl glutamate, glutamine and
asparagine.

Group V includes histidine, lysine, ornithine, arginine, N-nitroarginine, β-cycloarginine, γ-hydroxyarginine,
N-amidinocitruline and 2-amino-4-guanidinobutanoic acid, homologs of lysine, homologs of arginine and homologs
of ornithine. Preferably, Group V includes histidine, lysine, arginine and ornithine. A homolog of an amino
acid includes from 1 to about 3 additional or subtracted methylene units in the side chain.

Group VI includes serine, threonine, cysteine and modified amino acids having C1-C5 straight or branched alkyl side
chains substituted with -OH or -SH, for example, -CH2CH2OH, -CH2CH2CH2OH or -CH2CHOHCH3.
Preferably, Group VI includes serine, cysteine or threonine.
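
The group-based rule above can be illustrated with a short sketch. It encodes only the preferred members of Groups I to VI as listed above and leaves out the esters and other modified amino acids for brevity. The names `PREFERRED_GROUPS`, `shared_groups` and `is_conservative` are hypothetical and are not part of the specification.

```python
# Minimal sketch: preferred members of Groups I-VI as listed in the text.
# Modified amino acids and esters (e.g., benzyl aspartate) are omitted here.
PREFERRED_GROUPS = {
    "I": {"leucine", "isoleucine", "valine", "methionine"},
    "II": {"glycine", "alanine"},
    "III": {"phenylalanine", "tyrosine", "tryptophan"},
    "IV": {"glutamic acid", "aspartic acid", "glutamine", "asparagine"},
    "V": {"histidine", "lysine", "arginine", "ornithine"},
    "VI": {"serine", "cysteine", "threonine"},
}


def shared_groups(original, substitute):
    """Return the names of all groups that contain both residues."""
    a, b = original.lower(), substitute.lower()
    return [
        name
        for name, members in PREFERRED_GROUPS.items()
        if a in members and b in members
    ]


def is_conservative(original, substitute):
    """A substitution is treated as conservative if both residues share a group."""
    return bool(shared_groups(original, substitute))


if __name__ == "__main__":
    print(is_conservative("leucine", "valine"))
    print(is_conservative("glycine", "tryptophan"))
```

Under this simplified table, a substitution between residues that share no group, such as tryptophan for glycine, is simply reported as not conservative.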

[0114] In another aspect, suitable substitutions for amino acid residues include "severe" substitutions. A "severe 
substitution" is a substitution in which the substituting amino acid (naturally occurring or modified) has significantly 
different size and/or electronic properties compared with the amino acid being substituted. Thus, the side chain of the 
substituting amino acid can be significantly larger (or smaller) than the side chain of the amino acid being substituted 
and/or can have functional groups with significantly different electronic properties than the amino acid being substituted. 
Examples of severe substitutions of this type include the substitution of phenylalanine or cyclohexylmethyl glycine for 
alanine, isoleucine for glycine, a D amino acid for the corresponding L amino acid, or -NH-CH[(-CH2)5-COOH]-CO- for
aspartic acid. Alternatively, a functional group may be added to the side chain, deleted from the side chain or exchanged
with another functional group. Examples of severe substitutions of this type include adding of valine, leucine or isoleucine,
exchanging the carboxylic acid in the side chain of aspartic acid or glutamic acid with an amine, or deleting the amine
group in the side chain of lysine or ornithine. In yet another alternative, the side chain of the substituting amino acid can
have significantly different steric and electronic properties than the functional group of the amino acid being substituted.
Examples of such modifications include tryptophan for glycine, lysine for aspartic acid and -(CH2)4COOH for the side
chain of serine. These examples are not meant to be limiting. 

[0115] In another embodiment, for example in the synthesis of a peptide 26 amino acids in length, the individual amino
acids may be substituted in the following manner.

AA1 is serine, glycine, alanine, cysteine or threonine;
AA2 is alanine, threonine, glycine, cysteine or serine;
AA3 is valine, arginine, leucine, isoleucine, methionine, ornithine, lysine, N-nitroarginine, β-cycloarginine,
γ-hydroxyarginine, N-amidinocitruline or 2-amino-4-guanidinobutanoic acid;
AA4 is proline, leucine, valine, isoleucine or methionine;
AA5 is tryptophan, alanine, phenylalanine, tyrosine or glycine;
AA6 is serine, glycine, alanine, cysteine or threonine;
AA7 is proline, leucine, valine, isoleucine or methionine;
AA8 is alanine, threonine, glycine, cysteine or serine;
AA9 is alanine, threonine, glycine, cysteine or serine;
AA10 is leucine, isoleucine, methionine or valine;
AA11 is serine, glycine, alanine, cysteine or threonine;
AA12 is leucine, isoleucine, methionine or valine;
AA13 is leucine, isoleucine, methionine or valine;
AA14 is glutamine, glutamic acid, aspartic acid, asparagine, or a substituted or unsubstituted aliphatic or aryl ester
of glutamic acid or aspartic acid;
AA15 is arginine, N-nitroarginine, β-cycloarginine, γ-hydroxyarginine, N-amidinocitruline or
2-amino-4-guanidinobutanoic acid;
AA16 is proline, leucine, valine, isoleucine or methionine;
AA17 is serine, glycine, alanine, cysteine or threonine;
AA18 is glutamic acid, aspartic acid, asparagine, glutamine or a substituted or unsubstituted aliphatic or aryl ester
of glutamic acid or aspartic acid;
AA19 is aspartic acid, asparagine, glutamic acid, glutamine, leucine, valine, isoleucine, methionine or a substituted
or unsubstituted aliphatic or aryl ester of glutamic acid or aspartic acid;
AA20 is valine, arginine, leucine, isoleucine, methionine, ornithine, lysine, N-nitroarginine, β-cycloarginine,
γ-hydroxyarginine, N-amidinocitruline or 2-amino-4-guanidinobutanoic acid;
AA21 is alanine, threonine, glycine, cysteine or serine;
AA22 is alanine, threonine, glycine, cysteine or serine;
AA23 is histidine, serine, threonine, cysteine, lysine or ornithine;
AA24 is threonine, aspartic acid, serine, glutamic acid or a substituted or unsubstituted aliphatic or aryl ester of
glutamic acid or aspartic acid;
AA25 is asparagine, aspartic acid, glutamic acid, glutamine, leucine, valine, isoleucine, methionine or a substituted
or unsubstituted aliphatic or aryl ester of glutamic acid or aspartic acid; and
AA26 is cysteine, histidine, serine, threonine, lysine or ornithine.

[0116] It is to be understood that these amino acid substitutions may be made for longer or shorter peptides than the
26 mer in the preceding example, and for proteins.

[0117] In one embodiment of the present invention, codons for the first several N-terminal amino acids of the
transposase are modified such that the third base of each codon is changed to an A or a T without changing the corresponding
amino acid. It is preferable that between approximately 1 and 20, more preferably 3 and 15, and most preferably between
4 and 12 of the first N-terminal codons of the gene of interest are modified such that the third base of each codon is
changed to an A or a T without changing the corresponding amino acid. In one embodiment, the first ten N-terminal
codons of the gene of interest are modified in this manner.
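
The third-base modification described above can be sketched as follows. This is an illustration only: it assumes the standard genetic code, and the function names `modify_third_bases` and `translate` are hypothetical. Trying A before T when both are synonymous is an arbitrary choice made for the example; the specification does not state a preference. Codons with no synonymous A- or T-ending alternative (for example ATG or TGG) are left unchanged.

```python
from itertools import product

# Standard genetic code, built in TCAG order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate complete codons of a coding sequence into one-letter residues."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - len(cds) % 3, 3))


def modify_third_bases(cds, n_codons=10):
    """Change the third base of the first n_codons codons to A or T when the
    encoded amino acid is unchanged; leave all other codons as they are."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    out = []
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if i // 3 < n_codons and codon[2] not in "AT":
            for base in "AT":  # arbitrary order: try A first, then T
                candidate = codon[:2] + base
                if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                    codon = candidate
                    break
        out.append(codon)
    return "".join(out) + cds[usable:]  # keep any trailing partial codon


if __name__ == "__main__":
    original = "ATGGCCCTGTGG"  # hypothetical 4-codon fragment: Met Ala Leu Trp
    modified = modify_third_bases(original)
    print(original, translate(original))
    print(modified, translate(modified))
```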

[0118] When several desired proteins, protein fragments or peptides are encoded in the gene of interest to be incor- 
porated into the genome, one of skill in the art will appreciate that the proteins, protein fragments or peptides may be 
separated by a spacer molecule such as, for example, a peptide, consisting of one or more amino acids. Generally, the 
spacer will have no specific biological activity other than to join the desired proteins, protein fragments or peptides
together, or to preserve some minimum distance or other spatial relationship between them. However, the constituent
amino acids of the spacer may be selected to influence some property of the molecule such as the folding, net charge, 
or hydrophobicity. The spacer may also be contained within a nucleotide sequence with a purification handle or be 
flanked by cleavage sites, such as proteolytic cleavage sites. 

[0119] Such polypeptide spacers may have from about 5 to about 40 amino acid residues. The spacers in a polypeptide
are independently chosen, but are preferably all the same. The spacers should allow for flexibility of movement in space
and are therefore typically rich in small amino acids, for example, glycine, serine, proline or alanine. Preferably, peptide
spacers contain at least 60%, more preferably at least 80% glycine or alanine. In addition, peptide spacers generally
have little or no biological and antigenic activity. Preferred spacers are (Gly-Pro-Gly-Gly)x (SEQ ID NO:26) and
(Gly4-Ser)y, wherein x is an integer from about 3 to about 9 and y is an integer from about 1 to about 8. Specific examples
of suitable spacers include
(Gly-Pro-Gly-Gly)3

SEQ ID NO:27 Gly Pro Gly Gly Gly Pro Gly Gly Gly Pro Gly Gly

(Gly4-Ser)3

SEQ ID NO:28 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser

or (Gly4-Ser)4

SEQ ID NO:29 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
Gly Gly Gly Gly Ser.
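
As a rough illustration of the repeat-based spacers above, the following sketch builds a spacer from a repeat unit and reports its combined glycine and alanine content. The helper names `build_spacer` and `gly_ala_fraction` are hypothetical and are used only for this example.

```python
# Repeat units written as three-letter residue codes, as in the text.
GPGG = ("Gly", "Pro", "Gly", "Gly")   # unit of (Gly-Pro-Gly-Gly)x
G4S = ("Gly",) * 4 + ("Ser",)         # unit of (Gly4-Ser)y


def build_spacer(unit, repeats):
    """Return the spacer as a list of residues: the unit repeated `repeats` times."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return list(unit) * repeats


def gly_ala_fraction(spacer):
    """Fraction of residues in the spacer that are glycine or alanine."""
    return sum(res in ("Gly", "Ala") for res in spacer) / len(spacer)


if __name__ == "__main__":
    for label, unit, n in (("(Gly-Pro-Gly-Gly)3", GPGG, 3),
                           ("(Gly4-Ser)3", G4S, 3),
                           ("(Gly4-Ser)4", G4S, 4)):
        spacer = build_spacer(unit, n)
        print(label, len(spacer), "residues,",
              f"{gly_ala_fraction(spacer):.0%} Gly/Ala:", " ".join(spacer))
```

With x from 3 to 9 and y from 1 to 8, these repeat units give spacers of 12 to 36 and 5 to 40 residues respectively, which is consistent with the stated range of about 5 to about 40 residues.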

[0120] Nucleotide sequences encoding for the production of residues which may be useful in purification of the
expressed recombinant protein may also be built into the vector. Such sequences are known in the art and include the
glutathione binding domain from glutathione S-transferase, polylysine, hexa-histidine or other cationic amino acids,
thioredoxin, hemagglutinin antigen and maltose binding protein.

[0121] Additionally, nucleotide sequences may be inserted into the gene of interest to be incorporated so that the 
protein or peptide can also include from one to about six amino acids that create signals for proteolytic cleavage. In this 
manner, if a gene is designed to make one or more peptides or proteins of interest in the transgenic animal, specific 
nucleotide sequences encoding for amino acids recognized by enzymes may be incorporated into the gene to facilitate
cleavage of the large protein or peptide sequence into desired peptides or proteins or both. For example, nucleotides
encoding a proteolytic cleavage site can be introduced into the gene of interest so that a signal sequence can be cleaved 
from a protein or peptide encoded by the gene of interest. Nucleotide sequences encoding other amino acid sequences 
which display pH sensitivity or chemical sensitivity may also be added to the vector to facilitate separation of the signal 
sequence from the peptide or protein of interest. 

[0122] Proteolytic cleavage sites include cleavage sites recognized by exopeptidases such as carboxypeptidase A,
carboxypeptidase B, aminopeptidase I, and dipeptidylaminopeptidase; endopeptidases such as trypsin, V8-protease,
enterokinase, factor Xa, collagenase, endoproteinase, subtilisin, and thrombin; and proteases such as Protease 3C, IgA
protease (Igase) and Rhinovirus 3C (PreScission) protease. Chemical cleavage sites are also included in the definition of
cleavage site as used herein. Chemical cleavage sites include, but are not limited to, sites cleaved by cyanogen bromide,
hydroxylamine, formic acid, and acetic acid.

[0123] In one embodiment of the present invention, a TAG sequence is linked to the gene of interest. The TAG 
sequence serves three purposes: 1) it allows free rotation of the peptide or protein to be isolated so there is no interference 
from the native protein or signal sequence, i.e. vitellogenin, 2) it provides a "purification handle" to isolate the protein 
using column purification, and 3) it includes a cleavage site to remove the desired protein from the signal and purification
sequences. Accordingly, as used herein, a TAG sequence includes a spacer sequence, a purification handle and a
cleavage site. The spacer sequences in the TAG proteins contain one or more repeats shown in SEQ ID NO:30. A
preferred spacer sequence comprises the sequence provided in SEQ ID NO:31. One example of a purification handle
is the gp41 hairpin loop from HIV I. Exemplary gp41 polynucleotide and polypeptide sequences are provided in SEQ ID
NO:32 and SEQ ID NO:33, respectively. However, it should be understood that any antigenic region may be used as a
purification handle, including any antigenic region of gp41. Preferred purification handles are those that elicit highly
specific antibodies. Additionally, the cleavage site can be any protein cleavage site known to one of ordinary skill in the
art and includes an enterokinase cleavage site comprising the Asp Asp Asp Asp Lys sequence (SEQ ID NO:34) and a
furin cleavage site. Constructs containing a TAG sequence are shown in Figures 2 and 3. In one embodiment of the
present invention, the TAG sequence comprises a polynucleotide sequence of SEQ ID NO:35.

Methods of Administering Transposon-Based Vectors 

[0124] The present invention includes methods of administering the transposon-based vectors to a bird. The present
invention also makes reference to methods of producing a transgenic animal wherein a gene of interest is incorporated
into the germline of the animal and methods of producing a transgenic animal wherein a gene of interest is incorporated
into cells other than the germline cells (somatic cells) of the animal. The transposon-based vectors of the present invention
are administered to an oviduct or an ovary via any method known to those of skill in the art. According to present claim
1, reproductive organ means an oviduct or an ovary.

[0125] In some embodiments, a transposon-based vector is directly administered to the oviduct or ovary. Direct
administration encompasses injection into the organ, and in a preferred embodiment, a transposon-based vector is injected
into the lumen of the oviduct, and more preferably, the lumen of the magnum or the infundibulum of the oviduct. The
transposon-based vectors may additionally or alternatively be placed in an artery supplying the reproductive organ.
Administering the vectors to the artery supplying the ovary results in transfection of follicles and oocytes in the ovary to
create a germline transgenic animal. Alternatively, supplying the vectors through an artery leading to the oviduct would
preferably transfect the tubular gland and epithelial cells. Such transfected cells could manufacture a desired protein or
peptide for deposition in the egg white. In one embodiment, a transposon-based vector is administered into the lumen
of the magnum or the infundibulum of the oviduct and to an artery supplying the oviduct. Indirect administration to the
oviduct epithelium may occur through the cloaca. Direct administration into the mammary gland comprises introduction
into the duct system of the mammary gland.

[0126] Administration of transposon-based vectors may occur in arteries supplying the ovary and/or through direct
intrathecal administration into the ovary through injection. 

[0127] The transposon-based vectors may be administered in a single administration, multiple administrations,
continuously, or intermittently. The transposon-based vectors may be administered by injection, via a catheter, an osmotic
mini-pump or any other method. In some embodiments, the transposon-based vector is administered to an animal in 
multiple administrations, each administration containing the vector and a different transfecting reagent. 
[0128] The transposon-based vectors may be administered to the bird at any point during the lifetime of the bird;
however, it is preferable that the vectors are administered prior to the bird reaching sexual maturity. The
transposon-based vectors are preferably administered to a chicken between approximately 14 and 16 weeks of age and to a quail
between approximately 5 and 10 weeks of age, more preferably 5 and 8 weeks of age, and most preferably between 5
and 6 weeks of age, when standard poultry rearing practices are used. The vectors may be administered at earlier ages
when exogenous hormones are used to induce early sexual maturation in the bird. In some embodiments, the
transposon-based vector is administered to a bird following an increase in proliferation of the oviduct epithelial cells and/or the
tubular gland cells. Such an increase in proliferation normally follows an influx of reproductive hormones in the area of
the oviduct. When the bird is an avian, the transposon-based vector is administered following an increase in proliferation
of the oviduct epithelial cells and before the avian begins to produce egg white constituents.
[0129] In a preferred embodiment, the bird is an avian. In one embodiment, between approximately 1 and 150 µg, 1
and 100 µg, 1 and 50 µg, preferably between 1 and 20 µg, and more preferably between 5 and 10 µg of
transposon-based vector DNA is administered to the oviduct of a bird. Optimal ranges depend upon the type of bird and the bird's
stage of sexual maturity. In a chicken, it is preferred that between approximately 1 and 100 µg, or 5 and 50 µg are
administered. In a quail, it is preferred that between approximately 5 and 10 µg are administered. Intraoviduct
administration of the transposon-based vectors of the present invention results in incorporation of the gene of interest into the
cells of the oviduct as evidenced by a PCR positive signal in the oviduct tissue. In other embodiments, the
transposon-based vector is administered to an artery that supplies the oviduct. These methods of administration may also be
combined with any methods for facilitating transfection, including without limitation, electroporation, gene guns, injection
of naked DNA, and use of dimethyl sulfoxide (DMSO).

[0130] According to the present invention, the transposon-based vector is administered in conjunction with an
acceptable carrier and/or transfection reagent. Acceptable carriers include, but are not limited to, water, saline, Hanks Balanced
Salt Solution (HBSS), Tris-EDTA (TE) and lyotropic liquid crystals. Transfection reagents commonly known to one of
ordinary skill in the art that may be employed include, but are not limited to, the following: cationic lipid transfection
reagents, cationic lipid mixtures, polyamine reagents, liposomes and combinations thereof; SUPERFECT®, Cytofectene,
BioPORTER®, GenePORTER®, NeuroPORTER®, and perfectin from Gene Therapy Systems; lipofectamine, cellfectin,
DMRIE-C, oligofectamine, TROJENE® and PLUS reagent from Invitrogen; Xtreme gene, fugene, DOSPER and DOTAP
from Roche; Lipotaxi and Genejammer from Stratagene; and Escort from SIGMA. In one embodiment, the transfection
reagent is SUPERFECT®. The ratio of DNA to transfection reagent may vary based upon the method of administration.
In one embodiment, the transposon-based vector is administered to the oviduct and the ratio of DNA to transfection
reagent can be from 1:1.5 to 1:15, preferably 1:2 to 1:5, all expressed as wt/vol. Transfection may also be accomplished
using other means known to one of ordinary skill in the art, including without limitation electroporation, gene guns, 
injection of naked DNA, and use of dimethyl sulfoxide (DMSO). 
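
The wt/vol ratios given above can be turned into reagent volumes with simple arithmetic. The sketch below is illustrative only and rests on an assumption the text does not state: that the weight is micrograms of DNA and the volume is microliters of transfection reagent. The names `reagent_volume_ul` and `volume_range_ul` are hypothetical.

```python
# DNA:reagent ratios from the text, stored as the reagent part of "1:n".
BROAD_RATIO = (1.5, 15.0)      # 1:1.5 to 1:15
PREFERRED_RATIO = (2.0, 5.0)   # 1:2 to 1:5


def reagent_volume_ul(dna_ug, parts_reagent):
    """Reagent volume for a 1:parts_reagent wt/vol ratio.

    ASSUMPTION: 1 part DNA = 1 microgram and 1 part reagent = 1 microliter.
    """
    if dna_ug <= 0 or parts_reagent <= 0:
        raise ValueError("DNA mass and ratio must be positive")
    return dna_ug * parts_reagent


def volume_range_ul(dna_ug, ratio_range=PREFERRED_RATIO):
    """Lowest and highest reagent volume for a given DNA mass and ratio range."""
    low, high = ratio_range
    return reagent_volume_ul(dna_ug, low), reagent_volume_ul(dna_ug, high)


if __name__ == "__main__":
    # Example: 10 micrograms of vector DNA, within the quoted dose range for a bird.
    print(volume_range_ul(10))
    print(volume_range_ul(10, BROAD_RATIO))
```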

[0131] Depending upon the cell or tissue type targeted for transfection, the form of the transposon-based vector may 
be important. Plasmids harvested from bacteria are generally closed circular supercoiled molecules, and this is the
preferred state of a vector for gene delivery because of the ease of preparation. In some instances, transposase
expression and insertion may be more efficient in a relaxed, closed circular configuration or in a linear configuration. In
still other instances, a purified transposase protein may be co-injected with a transposon-based vector containing the 
gene of interest for more immediate insertion. This could be accomplished by using a transfection reagent complexed 
with both the purified transposase protein and the transposon-based vector. 


Testing for and Breeding Animals Carrying the Transgene 

[0132] Following administration of a transposon-based vector to a bird, DNA is extracted from the bird to confirm
integration of the gene of interest. Advantages provided by the present invention include the high rates of integration,
or incorporation, and transcription of the gene of interest when administered to a bird via an intraoviduct or intraovarian
route (including intraarterial administrations to arteries leading to the oviduct or ovary). Example 6 below describes
isolation of a proinsulin/ENTTAG protein from a transgenic hen following ammonium sulfate precipitation and ion
exchange chromatography. Figure 5 demonstrates successful administration of a transposon-based vector to a hen,
successful integration of the gene of interest, successful production of a protein encoded by the gene of interest, and
successful deposition of the protein in egg white produced by the transgenic hen.

[0133] Actual frequencies of integration may be estimated both by comparative strength of the PCR signal, and by 
histological evaluation of the tissues by quantitative PCR. Another method for estimating the rate of transgene insertion 
is the so-called primed in situ hybridization technique (PRINS). This method determines not only which cells carry a
transgene of interest, but also into which chromosome the gene has inserted, and even what portion of the chromosome.
Briefly, labeled primers are annealed to chromosome spreads (affixed to glass slides) through one round of PCR, and
the slides are then developed through normal in situ hybridization procedures. This technique combines the best features
of in situ PCR and fluorescence in situ hybridization (FISH) to provide distinct chromosome location and copy number
of the gene in question. 

[0134] Breeding experiments are also conducted to determine if germline transmission of the transgene has occurred.
In a general bird breeding experiment performed according to the present invention, each male bird was exposed to 2-3
different adult female birds for 3-4 days each. This procedure was continued with different females for a total period of
6-12 weeks. Eggs are collected daily for up to 14 days after the last exposure to the transgenic male, and each egg is
incubated in a standard incubator. The resulting embryos are examined for transgene presence at day 3 or 4 using PCR.
It is to be understood that the above procedure can be modified to suit animals other than birds and that selective
breeding techniques may be performed to amplify gene copy numbers and protein output.
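
Results of the embryo PCR screen described above could be summarized as a germline transmission rate. The sketch below is one possible way to do so and is not taken from the specification: the Wilson score interval is an added assumption for expressing uncertainty, the function name `transmission_rate` is hypothetical, and the counts in the usage example are invented for illustration and are not experimental data.

```python
import math


def transmission_rate(pcr_positive, embryos_tested, z=1.96):
    """Return (rate, lower, upper): the fraction of PCR-positive embryos and an
    approximate 95% Wilson score interval (z = 1.96 by default)."""
    if embryos_tested <= 0 or not 0 <= pcr_positive <= embryos_tested:
        raise ValueError("need 0 <= pcr_positive <= embryos_tested and embryos_tested > 0")
    n = embryos_tested
    p = pcr_positive / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts: 5 PCR-positive embryos out of 50 examined at day 3 or 4.
    rate, low, high = transmission_rate(5, 50)
    print(f"rate={rate:.3f}, interval=({low:.3f}, {high:.3f})")
```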

Production of Desired Proteins or Peptides in Egg White 

[0135] In one embodiment, the transposon-based vectors of the present invention may be administered to a bird for
production of desired proteins or peptides in the egg white. These transposon-based vectors preferably contain one or
more of an ovalbumin promoter, an ovomucoid promoter, an ovalbumin signal sequence and an ovomucoid signal
sequence. Oviduct-specific ovalbumin promoters are described in B. O'Malley et al., 1987. EMBO J., vol. 6, pp. 2305-12;
A. Qiu et al., 1994. Proc. Nat. Acad. Sci. (USA), vol. 91, pp. 4451-4455; D. Monroe et al., 2000. Biochim. Biophys. Acta,
1517(1):27-32; H. Park et al., 2000. Biochem., 39:8537-8545; and T. Muramatsu et al., 1996. Poult. Avian Biol. Rev.,
6:107-123. Examples of transposon-based vectors designed for production of a desired protein in an egg white are
shown in Figures 2 and 3. 

Production of Desired Proteins or Peptides in Egg Yolk 

[0136] The present invention is particularly advantageous for production of recombinant peptides and proteins of low
solubility in the egg yolk. Such proteins include, but are not limited to, membrane-associated or membrane-bound
proteins, lipophilic compounds, attachment factors, receptors, and components of second messenger transduction
machinery. Low solubility peptides and proteins are particularly challenging to produce using conventional recombinant
protein production techniques (cell and tissue cultures) because they aggregate in water-based, hydrophilic
environments. Such aggregation necessitates denaturation and re-folding of the recombinantly-produced proteins, which may
deleteriously affect their structure and function. Moreover, even highly soluble recombinant peptides and proteins may
precipitate and require denaturation and renaturation when produced in sufficiently high amounts in recombinant protein
production systems. The present invention provides an advantageous resolution of the problem of protein and peptide
solubility during production of large amounts of recombinant proteins. 

[0137] In one embodiment of the present invention wherein germline transfection is obtained via intraovarian
administration of the transposon-based vector, deposition of a desired protein into the egg yolk is accomplished in offspring
by attaching a sequence encoding a protein capable of binding to the yolk vitellogenin receptor to a gene of interest that
encodes a desired protein. This transposon-based vector can be used for the receptor-mediated uptake of the desired
protein by the oocytes. In a preferred embodiment, the sequence ensuring the binding to the vitellogenin receptor is a
targeting sequence of a vitellogenin protein. The invention encompasses various vitellogenin proteins and their targeting
sequences. In a preferred embodiment, a chicken vitellogenin protein targeting sequence is used; however, due to the
high degree of conservation among vitellogenin protein sequences and known cross-species reactivity of vitellogenin
targeting sequences with their egg-yolk receptors, other vitellogenin targeting sequences can be substituted. One
example of a construct for use in the transposon-based vectors of the present invention and for deposition of an insulin
protein in an egg yolk is a transposon-based vector containing a vitellogenin promoter, a vitellogenin targeting sequence,
a TAG sequence, a pro-insulin sequence and a synthetic polyA sequence. The present invention includes, but is not
limited to, vitellogenin targeting sequences residing in the N-terminal domain of vitellogenin, particularly in lipovitellin I.
In one embodiment, the vitellogenin targeting sequence contains the polynucleotide sequence of SEQ ID NO:22. In a
preferred embodiment, the transposon-based vector contains a transposase gene operably-linked to a constitutive
promoter and a gene of interest operably-linked to a liver-specific promoter and a vitellogenin targeting sequence.

Isolation and Purification of Desired Protein or Peptide 


[0138] For large-scale production of protein, a bird breeding stock that is homozygous for the transgene is preferred.
Such homozygous individuals are obtained and identified through, for example, standard animal breeding procedures
or PCR protocols.

[0139] Once expressed, peptides, polypeptides and proteins can be purified according to standard procedures known
to one of ordinary skill in the art, including ammonium sulfate precipitation, affinity columns, column chromatography,
gel electrophoresis, high performance liquid chromatography, immunoprecipitation and the like. Substantially pure
compositions of about 50 to 99% homogeneity are preferred, and 80 to 95% or greater homogeneity are most preferred for
use as therapeutic agents. 

[0140] In one embodiment of the present invention, the bird in which the desired protein is produced is an egg-laying
bird. In a preferred embodiment of the present invention, the animal is an avian and a desired peptide, polypeptide or
protein is isolated from an egg white. Egg white containing the exogenous protein or peptide is separated from the yolk
and other egg constituents on an industrial scale by any of a variety of methods known in the egg industry. See, e.g.,
W. Stadelman et al. (Eds.), Egg Science & Technology, Haworth Press, Binghamton, NY (1995). Isolation of the
exogenous peptide or protein from the other egg white constituents is accomplished by any of a number of polypeptide
isolation and purification methods well known to one of ordinary skill in the art. These techniques include, for example,
chromatographic methods such as gel permeation, ion exchange, affinity separation, metal chelation, HPLC, and the
like, either alone or in combination. Another means that may be used for isolation or purification, either in lieu of or in
addition to chromatographic separation methods, includes electrophoresis. Successful isolation and purification is
confirmed by standard analytic techniques, including HPLC, mass spectroscopy, and spectrophotometry. These separation
methods are often facilitated if the first step in the separation is the removal of the endogenous ovalbumin fraction of
egg white, as doing so will reduce the total protein content to be further purified by about 50%.
[0141] To facilitate or enable purification of a desired protein or peptide, transposon-based vectors may include one 
or more additional epitopes or domains. Such epitopes or domains include DNA sequences encoding enzymatic or 
chemical cleavage sites including, but not limited to, an enterokinase cleavage site; the glutathione binding domain from
glutathione S-transferase; polylysine; hexa-histidine or other cationic amino acids; thioredoxin; hemagglutinin antigen;
maltose binding protein; a fragment of gp41 from HIV; and other purification epitopes or domains commonly known to 
one of skill in the art. 

[0142] In one representative embodiment, purification of desired proteins from egg white utilizes the antigenicity of
the ovalbumin carrier protein and particular attributes of a TAG linker sequence that spans ovalbumin and the desired
protein. The TAG sequence is particularly useful in this process because it contains 1) a highly antigenic epitope, a
fragment of gp41 from HIV, allowing for stringent affinity purification, and 2) a recognition site for the protease
enterokinase immediately juxtaposed to the desired protein. In a preferred embodiment, the TAG sequence comprises
approximately 50 amino acids. A representative TAG sequence is provided below.
Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp
Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Thr Thr Cys Ile Leu Lys Gly Ser Cys
Gly Trp Ile Gly Leu Leu Asp Asp Asp Asp Lys (SEQ ID NO:35)

The underlined sequences were taken from the hairpin loop domain of HIV gp-41 {SEQ ID NO:33). Sequences in italics 
represent the cleavage site for enterokinase (SEQ ID NO:34). The spacer sequence upstream of the loop domain was 
made from repeats of (Pro Ala Asp Asp Ala) (SEQ ID N0:31) to provide free rotation and promote surface availability 

of the hairpin loop from the ovalbumin carrier protein. 
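
The composition of the representative TAG sequence can be checked with a short script. The following is a minimal sketch, not part of the described method: it assumes the listing reads as six spacer repeats followed by the loop, a Leu Leu pair and the Asp Asp Asp Asp Lys site, and the names `THREE_TO_ONE`, `to_one_letter` and `tag` are illustrative only.

```python
# One-letter codes for the residues that occur in the representative TAG sequence.
THREE_TO_ONE = {
    "Pro": "P", "Ala": "A", "Asp": "D", "Thr": "T", "Cys": "C", "Ile": "I",
    "Leu": "L", "Lys": "K", "Gly": "G", "Ser": "S", "Trp": "W",
}

SPACER_REPEAT = "Pro Ala Asp Asp Ala"   # repeat unit of the spacer
N_REPEATS = 6                           # assumption: counted from the listing
REMAINDER = ("Thr Thr Cys Ile Leu Lys Gly Ser Cys Gly Trp Ile Gly "
             "Leu Leu Asp Asp Asp Asp Lys")


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


tag = to_one_letter(SPACER_REPEAT) * N_REPEATS + to_one_letter(REMAINDER)
ek_site_start = tag.find("DDDDK")       # 0-based index of the enterokinase site

print(tag)
print("length:", len(tag), "enterokinase site starts at index", ek_site_start)
```

Under these assumptions the assembled sequence is 50 residues long, in line with the stated length of approximately 50 amino acids, and the enterokinase site sits at the C-terminal end, next to where the desired protein would be fused.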

[0143] Isolation and purification of a desired protein is performed as follows:

1. Enrichment of the egg white protein fraction containing ovalbumin and the transgenic ovalbumin-TAG-desired
protein.

2. Size exclusion chromatography to isolate only those proteins within a narrow range of molecular weights (a further
enrichment of step 1).

3. Ovalbumin affinity chromatography. Highly specific antibodies to ovalbumin will eliminate virtually all extraneous
egg white proteins except ovalbumin and the transgenic ovalbumin-TAG-desired protein.

4. gp41 affinity chromatography using anti-gp41 antibodies. Stringent application of this step will result in virtually
pure transgenic ovalbumin-TAG-desired protein.

5. Cleavage of the transgene product can be accomplished in at least one of two ways:

a. The transgenic ovalbumin-TAG-desired protein is left attached to the gp41 affinity resin (beads) from step 4
and the protease enterokinase is added. This liberates the transgene target protein from the gp41 affinity resin
while the ovalbumin-TAG sequence is retained. Separation by centrifugation (in a batch process) or flow through
(in a column purification) leaves the desired protein together with enterokinase in solution. Enterokinase is
recovered and reused.

b. Alternatively, enterokinase is immobilized on resin (beads) by the addition of poly-lysine moieties to a
non-catalytic area of the protease. The transgenic ovalbumin-TAG-desired protein eluted from the affinity column
of step 4 is then applied to the protease resin. Protease action cleaves the ovalbumin-TAG sequence from the
desired protein and leaves both entities in solution. The immobilized enterokinase resin is recharged and reused.

c. The choice of these alternatives is made depending upon the size and chemical composition of the transgene
target protein.

6. A final separation of either of these two (5a or 5b) protein mixtures is made using size exclusion or enterokinase
affinity chromatography. This step allows for desalting, buffer exchange and/or polishing, as needed.

[0144] Cleavage of the transgene product (ovalbumin-TAG-desired protein) by enterokinase, then, results in two
products: ovalbumin-TAG and the desired protein. More specific methods for isolation using the TAG label are provided
in the Examples. Some desired proteins may require additions or modifications of the above-described approach as
known to one of ordinary skill in the art. The method is scalable from the laboratory bench to pilot and production facility
largely because the techniques applied are well documented in each of these settings.
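
The two-product outcome of enterokinase cleavage can be illustrated in silico. This is a sketch under stated assumptions: the carrier and target strings are hypothetical placeholders rather than real ovalbumin or proinsulin sequences, the function name `cleave_after_site` is illustrative, and the model assumes a single Asp Asp Asp Asp Lys site with cleavage immediately after the Lys.

```python
EK_SITE = "DDDDK"  # enterokinase recognition site; cleavage is modeled after the Lys


def cleave_after_site(fusion: str, site: str = EK_SITE) -> tuple[str, str]:
    """Split a fusion sequence just after the first occurrence of `site`.

    Returns (carrier plus TAG, desired protein). Assumes the site occurs
    once, at the junction between the TAG and the desired protein.
    """
    index = fusion.find(site)
    if index < 0:
        raise ValueError("no enterokinase site found in fusion sequence")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


# Hypothetical placeholder sequences, for illustration only.
carrier_placeholder = "MAAAAAAAAA"
tag = "PADDA" * 6 + "TTCILKGSCGWIGLLDDDDK"
target_placeholder = "GSGSGS"

fusion = carrier_placeholder + tag + target_placeholder
carrier_tag, desired = cleave_after_site(fusion)
print("retained on resin:", carrier_tag)
print("released:", desired)
```

Because the site is placed immediately before the desired protein, the released product carries no residual TAG residues in this model.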

[0145] In another representative embodiment, egg whites containing a protein of interest were pooled and separated,
in any order, from the yolks and other egg constituents by methods known to one skilled in the art. A variety of such
methods is described in manuals known in the art, such as Egg Science & Technology, W. Stadelman, et al. (Eds.),
Haworth Press, Binghamton, NY (1995).

[0146] One non-limiting example of a method for isolating a desired peptide, polypeptide or protein from an egg white
is as follows. It is to be understood that this method may be employed to isolate any desired peptide, polypeptide or
protein from the eggs of transgenic animals of the present invention. This present example involved transgenes that
used a portion of or the entire ovalbumin protein, or specific ovalbumin epitopes, as a carrier, linked to the protein of
interest via the specified TAG sequence, or another affinity/cleavage sequence. The TAG sequence contains the hairpin
loop epitope from HIV I followed by an enterokinase cleavage site.

[0147] First, the viscosity of the egg white was lowered by subjecting the egg white to low shear forces of 3140 cps
(Tung et al., 1969). The resulting pourable solution was then filtered to remove chalazae. An ammonium sulfate
precipitation was then used to enrich the fraction of transgenic protein (see, for example, Practical Protein Chemistry:
A Handbook, A. Darbre (Ed.), John Wiley & Sons Ltd, 1986). Other methods of crude fractionation known in the art are
also used as needed. The supernatant of this separation was then fractionated using size-exclusion chromatography,
further enriching the transgenic fusion protein fraction and eliminating the ammonium sulfate from the material. The
fusion protein was isolated by anti-ovalbumin affinity chromatography (batch or column) using methods known to one
skilled in the art. This step may capture native ovalbumin in addition to an ovalbumin-transgene fusion protein. After
elution from the anti-ovalbumin affinity resin, the transgenic protein was specifically isolated using anti-gp41 affinity
chromatography (batch or column) using methods known to one skilled in the art.

[0148] Cleavage of the transgene product from the carrier and the TAG sequences was accomplished in one of at
least two ways:

1) The transgenic ovalbumin-TAG-transgene target protein was left attached to the gp41 affinity resin and the
protease enterokinase was added. Cleavage of the transgene product by enterokinase liberated the transgene target
protein from the gp41 affinity resin while the ovalbumin-TAG sequence was retained. Separation by centrifugation (in a
batch process) or flow through (in a column purification) kept the transgene target protein together with enterokinase
in solution. Enterokinase was recovered and reused.

2) Alternatively, enterokinase was immobilized on resin (beads) by the addition of poly-lysine moieties to a
non-catalytic area of the protease. The transgenic ovalbumin-TAG-transgene target protein was eluted from the gp41
affinity chromatography resin and then applied to the protease resin. Protease action cleaved the ovalbumin-TAG
sequence from the transgene target protein and left both entities in solution. The immobilized enterokinase resin
was recharged and reused. The choice between these alternatives is made on a case-by-case basis, depending
upon the size and chemical composition of the transgene target protein.

[0149] A final separation of either of these two (process 1 or 2) protein mixtures was made using size exclusion
chromatography or enterokinase affinity chromatography. This step also allows for desalting, concentrating, buffer
exchange and/or polishing, as needed.

[0150] It is believed that a typical chicken egg produced by a transgenic animal of the present invention will contain
at least 0.001 mg, from about 0.001 to 1.0 mg, or from about 0.001 to 100.0 mg of exogenous protein, peptide or
polypeptide, in addition to the normal constituents of egg white (or possibly replacing a small fraction of the latter). In
some embodiments, a chicken egg will contain between 50 and 75 mg of exogenous protein.
[0151] One of skill in the art will recognize that after biological expression or purification, the desired proteins, fragments
thereof and peptides may possess a conformation substantially different than the native conformations of the proteins,
fragments thereof and peptides. In this case, it is often necessary to denature and reduce protein and then to cause the
protein to re-fold into the preferred conformation. Methods of reducing and denaturing proteins and inducing re-folding
are well known to those of skill in the art.

Production of Protein or Peptide in Milk

[0152] In addition to methods of producing eggs containing transgenic proteins or peptides, the present invention
makes reference for comparison to methods for the production of milk containing transgenic proteins or peptides. These
methods include the administration of a transposon-based vector described above to a mammal through the duct system.

[0153] The transposon-based vector may contain a transposase operably-linked to a constitutive promoter and a gene
of interest operably-linked to a mammary-specific promoter. Genes of interest can include, but are not limited to, antiviral
and antibacterial proteins and immunoglobulins. Further, a transposon-based vector may be administered to the ovary
of an animal and germline transformation is obtained. In such cases, offspring of the transfected animal express a gene
of interest in the mammary gland under the control of a mammary gland-specific promoter.

[0154] The following examples will serve to further illustrate the present invention without, at the same time, however,
constituting any limitation thereof. On the contrary, it is to be clearly understood that resort may be had to various
embodiments, modifications and equivalents thereof which, after reading the description herein, may suggest themselves
to those skilled in the art without departing from the spirit of the invention.

EXAMPLE 1

IntraOviduct Administration of Transposon-Based Vectors 

[0155] Quail or chickens were selected for administration of the transposon-based vectors of the present invention.
Feathers were removed from the area where surgery was performed and the area was cleansed and sterilized by rinsing
it with ethanol (alcohol) and 0.5% chlorhexidine. Using a scalpel, a dorsolateral incision approximately 2 cm in length
was made through the skin over the ovary. Using blunt scissors, a second incision was made through the muscle
between the last two ribs to expose the oviduct beneath. A small animal retractor was used to spread the last two ribs,
exposing the oviduct beneath. The oviduct was further exposed using retractors to pull the intestines to one side.
[0156] A delivery solution containing a transposon-based vector and SUPERFECT® was prepared fresh immediately
before surgery. Specific ratios of vector and SUPERFECT® that were used in each experiment are provided in the
Examples below. The delivery solution was warmed to room temperature prior to injection into the bird. Approximately
250-500 µl of the delivery solution was injected into the lumen of the magnum of the oviduct using a 1 cc syringe with
a 27 gauge needle attached. The wound was closed and antibiotic cream liberally applied to the area surrounding the
wound.

EXAMPLE 2

Preparation of Transposon-Based Vector pTnMod

[0157] A vector was designed for inserting a desired coding sequence into the genome of eukaryotic cells, given below
as SEQ ID NO:3. The vector of SEQ ID NO:3, termed pTnMod, was constructed and its sequence verified.

[0158] This vector employed a cytomegalovirus (CMV) promoter. A modified Kozak sequence (ACCATG) (SEQ ID
NO:1) was added to the promoter. The nucleotides in the wobble position in nucleotide triplet codons encoding the first
10 amino acids of transposase were changed to an adenine (A) or thymine (T), which did not alter the amino acid encoded
by each codon. Two stop codons were added and a synthetic polyA was used to provide a strong termination sequence.
This vector uses a promoter designed to be active soon after entering the cell (without any induction) to increase the
likelihood of stable integration. The additional stop codons and synthetic polyA ensure proper termination without read
through to potential genes downstream.

[0159] The first step in constructing this vector was to modify the transposase to have the desired changes.
Modifications to the transposase were accomplished with the primers High Efficiency forward primer (Hef) Altered transposase
(ATS)-Hef 5' ATCTCGAGACCATGTGTGAACTTGATATnTACATGATTCTCTTTACC 3' (SEQ ID NO:36) and Altered
transposase-High efficiency reverse primer (Her) 5' GATTGATCATTATCATAATTTCCCCAAAGGGTAACC 3' (SEQ ID
NO:37, a reverse complement primer). In the 5' forward primer ATS-Hef, the sequence CTCGAG (SEQ ID NO:38) is the
recognition site for the restriction enzyme Xho I, which permits directional cloning of the amplified gene. The sequence
ACCATG (SEQ ID NO:1) contains the Kozak sequence and start codon for the transposase and the underlined bases
represent changes in the wobble position to an A or T of codons for the first 10 amino acids (without changing the amino
acid coded by the codon). Primer ATS-Her (SEQ ID NO:37) contains an additional stop codon TAA in addition to native
stop codon TGA and adds a Bcl I restriction site, TGATCA (SEQ ID NO:39), to allow directional cloning. These primers
were used in a PCR reaction with pTnLac (p defines plasmid, tn defines transposon, and lac defines the beta fragment
of the lactose gene, which contains a multiple cloning site) as the template for the transposase and a FailSafe™ PCR
System (which includes enzyme, buffers, dNTP's, MgCl2 and PCR Enhancer; Epicentre Technologies, Madison, WI).
Amplified PCR product was electrophoresed on a 1% agarose gel, stained with ethidium bromide, and visualized on an
ultraviolet transilluminator. A band corresponding to the expected size was excised from the gel and purified from the
agarose using a Zymo Clean Gel Recovery Kit (Zymo Research, Orange, CA). Purified DNA was digested with restriction
enzymes Xho I (5') and Bcl I (3') (New England Biolabs, Beverly, MA) according to the manufacturer's protocol. Digested
DNA was purified from restriction enzymes using a Zymo DNA Clean and Concentrator Kit (Zymo Research).
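
The primer features described above can be located programmatically. The sketch below is illustrative only: it uses just the first 14 bases of the forward primer as printed (through the start codon) together with the full reverse primer, and the variable and function names are assumptions introduced here.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


# 5' portion of ATS-Hef (through the start codon) and the full ATS-Her primer.
ATS_HEF_5PRIME = "ATCTCGAGACCATG"
ATS_HER = "GATTGATCATTATCATAATTTCCCCAAAGGGTAACC"

XHO_I_SITE = "CTCGAG"
BCL_I_SITE = "TGATCA"
KOZAK_ATG = "ACCATG"

xho_pos = ATS_HEF_5PRIME.find(XHO_I_SITE)    # 0-based position of the Xho I site
kozak_pos = ATS_HEF_5PRIME.find(KOZAK_ATG)   # 0-based position of ACCATG

# The reverse primer anneals to the sense strand, so its reverse complement
# shows the end of the coding sequence as it reads in the amplified product.
her_sense = reverse_complement(ATS_HER)

print("Xho I site at", xho_pos, "and Kozak/ATG at", kozak_pos)
print("sense-strand view of ATS-Her:", her_sense)
print("TGA, TAA, then Bcl I site present:", "TGATAATGATCA" in her_sense)
```

Read on the sense strand, the reverse primer places the native TGA, the added TAA and the Bcl I site in direct succession, which matches the description of the two stop codons followed by the cloning site.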

[0160] Plasmid gWhiz (Gene Therapy Systems, San Diego, CA) was digested with restriction enzymes Sal I and
BamH I (New England Biolabs), which are compatible with Xho I and Bcl I, but destroy the restriction sites. Digested
gWhiz was separated on an agarose gel, the desired band excised and purified as described above. Cutting the vector
in this manner facilitated directional cloning of the modified transposase (mATS) between the CMV promoter and synthetic
polyA.
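
Why Sal I and BamH I ends accept Xho I and Bcl I ends, yet leave no recleavable site, can be shown with a small model. This is a sketch assuming the standard recognition sequences and cut positions for these four enzymes (each cuts after the first base of its 6-bp palindromic site); the helper names are illustrative.

```python
# Recognition site and number of bases before the top-strand cut.
ENZYMES = {
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
    "BclI": ("TGATCA", 1),
    "BamHI": ("GGATCC", 1),
}


def overhang(enzyme: str) -> str:
    """5' overhang left by a palindromic cutter: the bases between the two cuts."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def hybrid_junction(left_enzyme: str, right_enzyme: str) -> str:
    """Sequence formed when the left half of one cut site joins the right half of another."""
    left_site, left_cut = ENZYMES[left_enzyme]
    right_site, right_cut = ENZYMES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]


all_sites = {site for site, _ in ENZYMES.values()}

# Upstream junction: Sal I-cut vector joined to the Xho I-cut insert.
upstream = hybrid_junction("SalI", "XhoI")
# Downstream junction: Bcl I-cut insert joined to the BamH I-cut vector.
downstream = hybrid_junction("BclI", "BamHI")

print(overhang("XhoI"), overhang("SalI"), upstream, upstream in all_sites)
print(overhang("BclI"), overhang("BamHI"), downstream, downstream in all_sites)
```

The matching overhangs allow ligation, while each hybrid junction differs from all four recognition sequences, so none of the enzymes can cut the joined product at that position.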

[0161] To insert the mATS between the CMV promoter and synthetic polyA in gWhiz, a Stratagene T4 Ligase Kit
(Stratagene, Inc., La Jolla, CA) was used and the ligation set up according to the manufacturer's protocol. Ligated product
was transformed into E. coli Top10 competent cells (Invitrogen Life Technologies, Carlsbad, CA) using chemical
transformation according to Invitrogen's protocol. Transformed bacteria were incubated in 1 ml of SOC (GIBCO BRL, CAT#
15544-042) medium for 1 hour at 37° C before being spread to LB (Luria-Bertani media (broth or agar)) plates
supplemented with 100 µg/ml ampicillin (LB/amp plates). These plates were incubated overnight at 37° C and resulting colonies
picked to LB/amp broth for overnight growth at 37° C. Plasmid DNA was isolated using a modified alkaline lysis protocol
(Sambrook et al., 1989), electrophoresed on a 1% agarose gel, and visualized on a U.V. transilluminator after ethidium
bromide staining. Colonies producing a plasmid of the expected size (approximately 6.4 kbp) were cultured in at least
250 ml of LB/amp broth and plasmid DNA harvested using a Qiagen Maxi-Prep Kit (column purification) according to
the manufacturer's protocol (Qiagen, Inc., Chatsworth, CA). Column purified DNA was used as template for sequencing
to verify the changes made in the transposase were the desired changes and no further changes or mutations occurred
due to PCR amplification. For sequencing, Perkin-Elmer's Big Dye Sequencing Kit was used. All samples were sent to
the Gene Probes and Expression Laboratory (LSU School of Veterinary Medicine) for sequencing on a Perkin-Elmer
Model 377 Automated Sequencer.

[0162] Once a clone was identified that contained the desired mATS in the correct orientation, primers CMVf-NgoM
IV (5' TT GCCGGC ATCAGATTGGCTAT (SEQ ID NO:40); underlined bases denote a NgoM IV recognition site) and
Syn-polyA-BstE II (5' AG AGGTCACC GGGTCAATTCTTCAGCACCTGGTA (SEQ ID NO:41); underlined bases denote
a BstE II recognition site) were used to PCR amplify the entire CMV promoter, mATS, and synthetic polyA for cloning
upstream of the transposon in pTnLac. The PCR was conducted with FailSafe™ as described above, purified using the
Zymo Clean and Concentrator kit, the ends digested with NgoM IV and BstE II (New England Biolabs), purified with the
Zymo kit again and cloned upstream of the transposon in pTnLac as described below.

[0163] Plasmid pTnLac was digested with NgoM IV and BstE II to remove the ptac promoter and transposase and the
fragments separated on an agarose gel. The band corresponding to the vector and transposon was excised, purified
from the agarose, and dephosphorylated with calf intestinal alkaline phosphatase (New England Biolabs) to prevent
self-annealing. The enzyme was removed from the vector using a Zymo DNA Clean and Concentrator-5. The purified
vector and CMVp/mATS/polyA were ligated together using a Stratagene T4 Ligase Kit and transformed into E. coli as
described above.

[0164] Colonies resulting from this transformation were screened (mini-preps) as described above and clones that were
the correct size were verified by DNA sequence analysis as described above. The vector was given the name pTnMod
(SEQ ID NO:3) and includes the following components:

Base pairs 1 - 130 are a remainder of F1(-) ori from pBluescriptII sk(-) (Stratagene), corresponding to base pairs
1-130 of pBluescriptII sk(-).

Base pairs 131 - 132 are a residue from ligation of restriction enzyme sites used in constructing the vector.

Base pairs 133 - 1777 are the CMV promoter/enhancer taken from vector pGWiz (Gene Therapy Systems),
corresponding to bp 229-1873 of pGWiz. The CMV promoter was modified by the addition of an ACC sequence upstream
of ATG.

Base pairs 1778 - 1779 are a residue from ligation of restriction enzyme sites used in constructing the vector.

Base pairs 1780 - 2987 are the coding sequence for the transposase, modified from Tn10 (GenBank accession
J01829) by optimizing codons for stability of the transposase mRNA and for the expression of protein. More
specifically, in each of the codons for the first ten amino acids of the transposase, C or G was changed to A or T when
such a substitution would not alter the amino acid that was encoded.

Base pairs 2988 - 2993 are two engineered stop codons.

Base pair 2994 is a residue from ligation of restriction enzyme sites used in constructing the vector.

Base pairs 2995 - 3410 are a synthetic polyA sequence taken from the pGWiz vector (Gene Therapy Systems),
corresponding to bp 1922-2337 of pGWiz.

Base pairs 3415 - 3718 are non-coding DNA that is residual from vector pNK2859.

Base pairs 3719 - 3761 are non-coding λ DNA that is residual from pNK2859.

Base pairs 3762 - 3831 are the 70 bp of the left insertion sequence recognized by the transposon Tn10.

Base pairs 3832 - 3837 are a residue from ligation of restriction enzyme sites used in constructing the vector.

Base pairs 3838 - 4527 are the multiple cloning site from pBluescriptII sk(-), corresponding to bp 924-235 of
pBluescriptII sk(-). This multiple cloning site may be used to insert any coding sequence of interest into the vector.

Base pairs 4528 - 4532 are a residue from ligation of restriction enzyme sites used in constructing the vector.

Base pairs 4533 - 4602 are the 70 bp of the right insertion sequence recognized by the transposon Tn10.

Base pairs 4603 - 4644 are non-coding λ DNA that is residual from pNK2859.

Base pairs 4645 - 5488 are non-coding DNA that is residual from pNK2859.

Base pairs 5489 - 7689 are from the pBluescriptII sk(-) base vector (Stratagene, Inc.), corresponding to bp 761-2961
of pBluescriptII sk(-).

[0165] Completing pTnMod is a pBluescript backbone that contains a ColE1 origin of replication and an antibiotic
resistance marker (ampicillin).

[0166] It should be noted that all non-coding DNA sequences described above can be replaced with any other
non-coding DNA sequence(s). Missing nucleotide sequences in the above construct represent restriction site remnants.
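
The statement that missing nucleotide ranges are restriction site remnants can be made concrete by scanning the pTnMod component list for unlisted positions. The following is a minimal sketch; the coordinates are transcribed from the list above, and the short labels and function name are illustrative.

```python
# (start, end, label) for each listed pTnMod component, 1-based and inclusive.
FEATURES = [
    (1, 130, "F1(-) ori remainder"),
    (131, 132, "ligation residue"),
    (133, 1777, "CMV promoter/enhancer"),
    (1778, 1779, "ligation residue"),
    (1780, 2987, "modified Tn10 transposase"),
    (2988, 2993, "two engineered stop codons"),
    (2994, 2994, "ligation residue"),
    (2995, 3410, "synthetic polyA"),
    (3415, 3718, "non-coding DNA (pNK2859)"),
    (3719, 3761, "lambda DNA (pNK2859)"),
    (3762, 3831, "left insertion sequence"),
    (3832, 3837, "ligation residue"),
    (3838, 4527, "multiple cloning site"),
    (4528, 4532, "ligation residue"),
    (4533, 4602, "right insertion sequence"),
    (4603, 4644, "lambda DNA (pNK2859)"),
    (4645, 5488, "non-coding DNA (pNK2859)"),
    (5489, 7689, "pBluescriptII sk(-) base vector"),
]


def unlisted_gaps(features):
    """Return (start, end) ranges that fall between consecutive listed features."""
    ordered = sorted(features)
    gaps = []
    for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


gaps = unlisted_gaps(FEATURES)
listed_bp = sum(end - start + 1 for start, end, _ in FEATURES)
print("unlisted ranges:", gaps)
print("listed bp:", listed_bp, "of", FEATURES[-1][1])
```

With the coordinates as transcribed, the only unlisted range is the four base pairs between the synthetic polyA and the pNK2859-derived non-coding DNA.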
[0167] All plasmid DNA was isolated by standard procedures. Briefly, Escherichia coli containing the plasmid was
grown in 500 mL aliquots of LB broth (supplemented with an appropriate antibiotic) at 37°C overnight with shaking.
Plasmid DNA was recovered from the bacteria using a Qiagen Maxi-Prep kit (Qiagen, Inc., Chatsworth, CA) according
to the manufacturer's protocol. Plasmid DNA was resuspended in 500 µL of PCR-grade water and stored at -20°C until
used.

EXAMPLE 3

Transposon-Based Vector pTnMCS

[0168] Another transposon-based vector was designed for inserting a desired coding sequence into the genome of
eukaryotic cells. This vector was termed pTnMCS and its constituents are provided below. The sequence of the pTnMCS
vector is provided in SEQ ID NO:2. The pTnMCS vector contains an avian optimized polyA sequence operably-linked
to the transposase gene. The avian optimized polyA sequence contains approximately 40 nucleotides that precede the
A nucleotide string.


Bp 1 - 130 Remainder of F1(-) ori of pBluescriptII sk(-) (Stratagene), bp 1-130
Bp 133 - 1777 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy Systems), bp 229-1873
Bp 1783 - 2991 Transposase, from Tn10 (GenBank accession #J01829), bp 108-1316
Bp 2992 - 3344 Non-coding DNA from vector pNK2859
Bp 3345 - 3387 Lambda DNA from pNK2859
Bp 3388 - 3457 70 bp of IS10 left from Tn10
Bp 3464 - 3670 Multiple cloning site from pBluescriptII sk(-), thru the XmaI site, bp 924-718
Bp 3671 - 3715 Multiple cloning site from pBluescriptII sk(-), from the XmaI site thru the XhoI site, bp 717-673. These
base pairs are usually lost when cloning into pTnMCS
Bp 3716 - 4153 Multiple cloning site from pBluescriptII sk(-), from the XhoI site, bp 672-235
Bp 4159 - 4228 70 bp of IS10 right from Tn10
Bp 4229 - 4270 Lambda DNA from pNK2859
Bp 4271 - 5114 Non-coding DNA from pNK2859
Bp 5115 - 7315 pBluescript sk(-) base vector (Stratagene, Inc.), bp 761-2961

EXAMPLE 4

Preparation of Transposon-Based Vector pTnMod(Oval/ENT TAG/ProIns/PA)-Chicken

[0169] A vector was designed to insert a human proinsulin coding sequence under the control of a chicken ovalbumin
promoter, and an ovalbumin gene including an ovalbumin signal sequence, into the genome of a bird given below as
SEQ ID NO:42.

Base pairs 1 - 130 are a remainder of F1(-) ori of pBluescriptII sk(-) (Stratagene) corresponding to base pairs 1-130
of pBluescriptII sk(-).

Base pairs 133 - 1777 are a CMV promoter/enhancer taken from vector pGWiz (Gene Therapy Systems)
corresponding to base pairs 229-1873 of pGWiz.

Base pairs 1780 - 2987 are a transposase, modified from Tn10 (GenBank accession number J01829).

Base pairs 2988 - 2993 are two engineered stop codons.

Base pairs 2995 - 3410 are a synthetic polyA from pGWiz (Gene Therapy Systems) corresponding to base pairs
1922-2337 of pGWiz.

Base pairs 3415 - 3718 are non-coding DNA that is residual from vector pNK2859.

Base pairs 3719 - 3761 are λ DNA that is residual from pNK2859.

Base pairs 3762 - 3831 are the 70 base pairs of the left insertion sequence (IS10) recognized by the transposon Tn10.

Base pairs 3838 - 4044 are a multiple cloning site from pBluescriptII sk(-) corresponding to base pairs 924-718 of
pBluescriptII sk(-).

Base pairs 4050 - 4951 are a chicken ovalbumin promoter (including SDRE) that corresponds to base pairs 431-1332
of the chicken ovalbumin promoter in GenBank Accession Number J00895 M24999.

Base pairs 4958 - 6115 are a chicken ovalbumin signal sequence and ovalbumin gene that correspond to base
pairs 66-1223 of GenBank Accession Number V00383.1 (the STOP codon being omitted).

Base pairs 6122 - 6271 are a TAG sequence containing a gp41 hairpin loop from HIV I, an enterokinase cleavage
site and a spacer (synthetic).

Base pairs 6272 - 6531 are a proinsulin gene.

Base pairs 6539 - 6891 are a synthetic polyadenylation sequence from pGWiz (Gene Therapy Systems)
corresponding to base pairs 1920 - 2272 of pGWiz.

Base pairs 6897 - 7329 are a multiple cloning site from pBluescriptII sk(-) corresponding to base pairs 667-235 of
pBluescriptII sk(-).

Base pairs 7335 - 7404 are the 70 base pairs of the right insertion sequence (IS10) recognized by the transposon Tn10.

Base pairs 7405 - 7446 are λ DNA that is residual from pNK2859.

Base pairs 7447 - 8311 are non-coding DNA that is residual from pNK2859.

Base pairs 8312 - 10512 are pBluescriptII sk(-) base vector (Stratagene, Inc.) corresponding to base pairs 761-2961
of pBluescriptII sk(-).


[0170] It should be noted that all non-coding DNA sequences described above can be replaced with any other non- 
coding DNA sequence(s). Missing nucleotide sequences in the above construct represent restriction site remnants. 

EXAMPLE 5

Transposon-Based Vector pTnMOD (CMV-CHOVg-ent-ProInsulin-synPA)

[0171] A vector was designed to insert a proinsulin coding sequence under the control of a quail ovalbumin promoter,
and an ovalbumin gene including an ovalbumin signal sequence, into the genome of a bird given below as SEQ ID NO:43.


Bp 1 - 4045 from vector pTnMod, bp 1 - 4045
Bp 4051 - 5695 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy Systems), bp 230-1864
Bp 5702 - 6855 Chicken ovalbumin gene taken from GenBank accession # V00383, bp 66-1219
Bp 6862 - 7011 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 7012 - 7272 Human proinsulin taken from GenBank accession # NM000207, bp 117-377
Bp 7273 - 7317 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and pGWIZ
(Gene Therapy Systems)
Bp 7318 - 7670 Synthetic polyA from the cloning vector pGWIZ (Gene Therapy Systems), bp 1920-2271
Bp 7672 - 11271 from cloning vector pTnMCS, bp 3716-7315
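
A coordinate table of this kind can be sanity-checked by comparing the length of each vector segment with the length of the source range it is said to come from. The sketch below is an illustration, not part of the described construction: the coordinates are transcribed from the table above, and the names `ROWS`, `span_length` and `mismatches` are introduced here.

```python
# (label, span in the vector, span in the cited source), 1-based and inclusive.
ROWS = [
    ("pTnMod backbone", (1, 4045), (1, 4045)),
    ("CMV promoter/enhancer", (4051, 5695), (230, 1864)),
    ("chicken ovalbumin gene", (5702, 6855), (66, 1219)),
    ("human proinsulin", (7012, 7272), (117, 377)),
    ("synthetic polyA", (7318, 7670), (1920, 2271)),
    ("pTnMCS segment", (7672, 11271), (3716, 7315)),
]


def span_length(span):
    """Length in base pairs of an inclusive (start, end) span."""
    start, end = span
    return abs(end - start) + 1


mismatches = [
    (label, span_length(vector_span), span_length(source_span))
    for label, vector_span, source_span in ROWS
    if span_length(vector_span) != span_length(source_span)
]

# The TAG region (spacer, hairpin loop, enterokinase site) expressed in codons.
tag_codons = span_length((6862, 7011)) // 3

for label, vector_bp, source_bp in mismatches:
    print(f"{label}: {vector_bp} bp in vector vs {source_bp} bp in source range")
print("TAG region codons:", tag_codons)
```

As transcribed, the TAG region spans 150 bp, or 50 codons, consistent with a TAG of approximately 50 amino acids. The CMV promoter/enhancer and synthetic polyA rows give lengths that differ between the two ranges; the sketch only flags such rows and does not determine which coordinate is correct.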


EXAMPLE 6

Transfection of Japanese Quail using a Transposon-based Vector containing a Proinsulin Gene via Oviduct Injections

[0172] Two experiments were conducted in Japanese quail using transposon-based vectors containing either Oval
promoter/Oval gene/GP41 Enterokinase TAG/Proinsulin/Poly A (SEQ ID NO:42) or CMV promoter/Oval gene/GP41
Enterokinase TAG/Proinsulin/Poly A (SEQ ID NO:43).

[0173] In the first experiment, the Oval promoter/Oval gene/GP41 Enterokinase TAG/Proinsulin/Poly A containing
construct was injected into the lumen of the oviduct of sexually mature quail; three hens received 5 µg at a 1:3
SUPERFECT® ratio and three received 10 µg at a 1:3 SUPERFECT® ratio. As of the writing of the present application, at least
one bird that received the above-mentioned construct was producing human proinsulin in egg white (other birds remain to
be tested). This experiment indicates that 1) the DNA has been stable for at least 3 months; 2) protein levels are
comparable to those observed with a constitutive promoter such as the CMV promoter; and 3) sexually mature birds
can be injected and results obtained without the need for cell culture. It is estimated that each quail egg contains
approximately 1.4 µg/ml of the proinsulin protein. It is also estimated that each transgenic chicken egg contains 50-75
mg of protein encoded by the gene of interest.

[0174] In the second experiment, the transposon-based vector containing CMV promoter/Oval gene/GP41 Enterokinase
TAG/Proinsulin/Poly A was injected into the lumen of the oviduct of sexually immature Japanese quail. A total of
9 birds were injected. Of the 8 survivors, 3 produced human proinsulin in the white of their eggs for over 6 weeks. An
ELISA assay described in detail below was developed to detect GP41 in the fusion peptide (Oval gene/GP41 Enterokinase
TAG/Proinsulin) since the GP41 peptide sequence is unique and not found as part of normal egg white protein. In all
ELISA assays, the same birds produced positive results and all controls worked as expected.
[0175] ELISA Procedure: Individual egg white samples were diluted in sodium carbonate buffer, pH 9.6, and added
to individual wells of 96 well microtiter ELISA plates at a total volume of 0.1 ml. These plates were then allowed to coat
overnight at 4°C. Prior to ELISA development, the plates were allowed to warm to room temperature. Upon decanting the
coating solutions and blotting away any excess, non-specific binding of antibodies was blocked by adding a solution of
phosphate buffered saline (PBS), 1% (w/v) BSA, and 0.05% (v/v) Tween 20 and allowing it to incubate with shaking for
a minimum of 45 minutes. This blocking solution was subsequently decanted and replaced with a solution of the primary
antibody (Goat Anti-GP41 TAG) diluted in fresh PBS/BSA/Tween 20. After a two hour period of incubation with the
primary antibody, each plate was washed with a solution of PBS and 0.05% Tween 20 in an automated plate washer to
remove unbound antibody. Next, the secondary antibody, Rabbit anti-Goat Alkaline Phosphatase-conjugated, was diluted
in PBS/BSA/Tween 20 and allowed to incubate 1 hour. The plates were then subjected to a second wash with
PBS/Tween 20. Antigen was detected using a solution of p-Nitrophenyl Phosphate in Diethanolamine Substrate Buffer for
Alkaline Phosphatase and measuring the absorbance at 30 minutes and 1 hour.

[0176] Additionally, a proinsulin fusion protein produced using a construct described above was isolated from egg
white using ammonium sulfate precipitation and ion exchange chromatography. A pooled fraction of the isolated fusion
protein was run on an SDS-PAGE gel shown in Figure 5, lanes 4 and 6. Lanes 1 and 10 of the gel contain molecular
weight standards, lanes 2 and 8 contain non-transgenic chicken egg white, whereas lanes 3, 5, 7 and 9 are blank.

EXAMPLE 7

Isolation of Human Proinsulin Using Anti-TAG Column Chromatography

[0177] A HiTrap NHS-activated 1 mL column (Amersham) was charged with a 30 amino acid peptide that contained
the gp-41 epitope containing gp-41's native disulfide bond that stabilizes the formation of the gp-41 hairpin loop. The
30 amino acid gp41 peptide is provided as SEQ ID NO:32. Approximately 10 mg of the peptide was dissolved in coupling
buffer (0.2 M NaHCO3, 0.5 M NaCl, pH 8.3) and the ligand was circulated on the column for 2 hours at room temperature
at 0.5 mL/minute. Excess active groups were then deactivated using 6 column volumes of 0.5 M ethanolamine, 0.5 M
NaCl, pH 8.3 and the column was washed alternately with 6 column volumes of acetate buffer (0.1 M acetate, 0.5 M
NaCl, pH 4.0) and ethanolamine (above). The column was neutralized using 1X PBS. The column was then washed
with buffers to be used in affinity purification: 75 mM Tris, pH 8.0 and elution buffer, 100 mM glycine-HCl, 0.5 M NaCl,
pH 2.7. Finally, the column was equilibrated in 75 mM Tris buffer, pH 8.0.
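
The quantities in the column preparation follow from simple arithmetic. The sketch below is illustrative: the 0.1 L batch volume is a hypothetical choice, the molar masses are standard rounded values, and pH adjustment is not modeled.

```python
# Approximate molar masses in g/mol (standard rounded values).
MOLAR_MASS_G_PER_MOL = {"NaHCO3": 84.01, "NaCl": 58.44}


def grams_needed(solute: str, molarity: float, volume_l: float) -> float:
    """Mass of solute required for the given molarity and volume."""
    return MOLAR_MASS_G_PER_MOL[solute] * molarity * volume_l


BATCH_VOLUME_L = 0.1     # hypothetical batch size, not stated in the text
COLUMN_VOLUME_ML = 1.0   # 1 mL HiTrap column

# Coupling buffer: 0.2 M NaHCO3, 0.5 M NaCl.
nahco3_g = grams_needed("NaHCO3", 0.2, BATCH_VOLUME_L)
nacl_g = grams_needed("NaCl", 0.5, BATCH_VOLUME_L)

# Ligand circulation: 2 hours at 0.5 mL/minute.
circulated_ml = 2 * 60 * 0.5

# Each deactivation or wash step uses 6 column volumes.
wash_ml = 6 * COLUMN_VOLUME_ML

print(f"NaHCO3: {nahco3_g:.2f} g, NaCl: {nacl_g:.2f} g per {BATCH_VOLUME_L} L")
print(f"volume passed during coupling: {circulated_ml} mL")
print(f"volume per 6-column-volume step: {wash_ml} mL")
```

At the stated flow rate, 60 mL passes over the 1 mL column during the 2 hour coupling step, so the ligand solution is recirculated many times over the resin.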

[0178] Antibodies to gp-41 were raised in goats by inoculation with the gp-41 peptide described above. More specifically,
goats were inoculated, given a booster injection of the gp-41 peptide and blood samples were obtained by venipuncture.
Serum was harvested by centrifugation. Approximately 30 mL of goat serum was filtered to 0.45 µm and passed over
a TAG column at a rate of 0.5 mL/min. The column was washed with 75 mM Tris, pH 8.0 until absorbance at 280 nm
reached a baseline. Three column volumes (3 mL) of elution buffer (100 mM glycine, 0.5 M NaCl, pH 2.7) were applied,
followed by 75 mM Tris buffer, pH 8.0, all at a rate of 0.5 mL/min. One milliliter fractions were collected. Fractions were
collected into 200 µL of 1 M Tris, pH 9.0 to neutralize acidic fractions as rapidly as possible. A large peak eluted from the
column, coincident with the application of the elution buffer. Fractions were pooled. Analysis by SDS-PAGE showed a high
molecular weight species that separated into two fragments under reducing conditions, in keeping with the heavy and
light chain structure of IgG.

[0179] Pooled antibody fractions were used to charge two 1 mL HiTrap NHS-activated columns, attached in series.
Coupling was carried out in the same manner as that used for charging the TAG column.

Isolation of Ovalbumin-TAG-Proinsulin from Egg White

[0180] Egg white from quail and chickens treated by intra-oviduct injection of the CMV-ovalbumin-TAG-proinsulin
construct was pooled. Viscosity was lowered by subjecting the allantoid fluid to successively finer pore sizes using
negative pressure filtration, finishing with a 0.22 µm pore size. Through the process, egg white was diluted approximately
1:16. The clarified sample was loaded on the Anti-TAG column and eluted in the same manner as described for the
purification of the anti-TAG antibodies. A peak of absorbance at 280 nm, coincident with the application of the elution
buffer, indicated that protein had been specifically eluted from the Anti-TAG column. Fractions containing the eluted
peak were pooled for analysis.

[0181] The pooled fractions from the Anti-TAG affinity column were characterized by SDS-PAGE and western blot
analysis. SDS-PAGE of the pooled fractions revealed a 60 kDal molecular weight band not present in control egg white
fluid, consistent with the predicted molecular weight of the transgenic protein. Although some contaminating bands were
observed, the 60 kDal species was greatly enriched compared to the other proteins. An aliquot of the pooled fractions
was cleaved overnight at room temperature with the protease enterokinase. SDS-PAGE analysis of the cleavage product
revealed a band not present in the uncut material that co-migrated with a commercial human proinsulin positive control.
Western blot analysis showed specific binding to the 60 kDal species under non-reducing conditions (which preserved
the hairpin epitope of gp-41 by retaining the disulfide bond). Western analysis, using an anti-human proinsulin antibody,
of the low molecular weight species that appeared upon cleavage conclusively identified the cleaved fragment as human
proinsulin.

EXAMPLE 8 

Purification Procedures for Insulin

[0182] I. ELISA data for egg characterization/identification

[0183] An ELISA was employed for the initial screening of eggs and, thereby, identification of hens producing positive
eggs. With further modifications this procedure was used for the initial quantification of recombinant protein amounts.
These procedures were aided by the successful purification of an initial stock of the recombinant proinsulin (RPI). This
stock of protein is used in the development of a double antibody assay that increases the sensitivity and reduces the
background in the assay. Subsequent identification of hens producing positive eggs obviates the need to screen each
egg collected. Only periodic checks are needed to determine if production levels are consistent.

II. Egg White (EW) or Albumin Preparation

A. Clarification - Ovomucin precipitation

[0184] Eggs from hens positively identified as producing RPI are pooled for RPI purification. The initial purification
step involved diluting the pool 1:1 with 100 mM Tris-HCl, pH 8 for a final concentration of 50 mM Tris-HCl. The pH of
this solution was then adjusted to 6 and ovomucin was allowed to precipitate at 4°C for a minimum of 3 hrs (preferably
overnight) with constant stirring. The precipitated ovomucin was then pelleted and removed by centrifugation at 2400 x
g. After collection of the RPI-containing supernatant, the pH of this solution was readjusted to 8.
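
The 1:1 dilution in the clarification step follows the usual C1V1 = C2V2 relation. The following is a minimal sketch of that arithmetic, assuming equal volumes of egg white pool and buffer and assuming the egg white itself contributes no Tris. The function name `final_concentration` and the unit volumes are illustrative and are not part of the procedure described above.

```python
def final_concentration(stock_mM: float, stock_volume: float, sample_volume: float) -> float:
    """Return the buffer concentration (mM) after mixing stock buffer with a sample.

    Assumption: the sample contributes none of the buffer species.
    Volumes may be in any unit as long as both use the same one.
    """
    total_volume = stock_volume + sample_volume
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return stock_mM * stock_volume / total_volume


# 1:1 dilution of pooled egg white with 100 mM Tris-HCl, pH 8
tris_mM = final_concentration(stock_mM=100.0, stock_volume=1.0, sample_volume=1.0)
print(f"Final Tris-HCl concentration: {tris_mM:.0f} mM")
```

The same helper can be reused for other mixing ratios by changing the two volume arguments.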

B. Filtration 

[0185] To prepare the egg white for loading onto the column and, thereby, minimize the potential for clogging the
columns during loading, the egg white solution was filtered to at least 0.45 µm.

[0186] Initially, the ovomucin precipitated egg white solution was subjected to successive filtration steps with the pore
size of the filtration membrane decreasing at each step. This procedure involved time and dilution of the egg white
solution to reach 0.45 µm filtration.

[0187] Amersham's hollow-fiber ultrafiltration apparatus was used to produce a column-ready solution filtered down
to < 0.2 µm with an undiluted starting solution. This approach minimized the time and the solution dilution needed to
prepare the egg white solution for column loading.

III. Purification 

A. Affinity Chromatography

[0188] Using antibody with specificity to a synthetic peptide modeled after the enterokinase recognition site, initial
purification schemes involved developing a one-step column purification procedure for the RPI.
[0189] Goats immunized with the synthetic Ent peptide were employed to produce anti-Ent Tag antiserum which was
used in the egg screening ELISAs followed by antibody purification. The purified goat Anti-Ent Tag antibodies were
covalently bound to the matrix of HiTrap NHS-activated HP columns (Amersham) and subsequently used to specifically
bind and purify the RPI.

[0190] An initial attempt was made to direct the first purification step against the ovalbumin portion of the recombinant
protein using an antibody specific for the ovalbumin portion. The present purification scheme employed a combination
of classical techniques such as ammonium sulfate precipitation, ion exchange, and gel filtration chromatography.

[0191] After the initial ovomucin precipitation, the egg white solution was subjected to protein precipitation using a
40% ammonium sulfate fractionation. The precipitated protein was subsequently collected via centrifugation and
resuspended in 50 mM Tris-HCl, pH 8. The resuspended protein solution was dialyzed to remove residual (NH4)2SO4 or
subjected to gel filtration to remove the (NH4)2SO4 and partially isolate the RPI from the remaining egg white protein.
The RPI was further isolated via anion exchange chromatography using a 0 to 0.5 M NaCl gradient in 50 mM Tris-HCl,
pH 8. Two possible elution profiles were observed. The first was observed at approximately 25% of the 0.5 M NaCl
gradient without (NH4)2SO4 precipitation. The second was observed at less than 16% gradient (approximately 7%)
following 40% (NH4)2SO4 precipitation and a longer gradient. Fractions containing RPI were identified by SDS-PAGE
analysis and pooled.
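
The gradient percentages quoted for the anion exchange step can be converted to approximate NaCl concentrations. The sketch below assumes that the percentages refer to position along a linear 0 to 0.5 M NaCl gradient, which is an interpretation and is not stated explicitly in the text. The function name `nacl_at_gradient` is illustrative.

```python
def nacl_at_gradient(percent: float, start_M: float = 0.0, end_M: float = 0.5) -> float:
    """Return the NaCl molarity at a given percentage of a linear gradient.

    Assumption: the gradient runs linearly from start_M (0%) to end_M (100%).
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    return start_M + (end_M - start_M) * percent / 100.0


# Percentages mentioned for the two elution profiles and the 16% upper bound
for pct in (25.0, 7.0, 16.0):
    print(f"{pct:g}% of gradient is about {nacl_at_gradient(pct):.3f} M NaCl")
```

Under this assumption the two elution positions correspond to roughly 0.125 M and 0.035 M NaCl, respectively.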

[0192] Three gel filtration columns, differing by column size and fractionation range, were employed in RPI purification
and/or desalting: Superdex 75 10/300 GL, Hiload 26/60 Superdex 75, and Hiload 26/60 Superdex 200. Using these
individual columns at different steps in the purification scheme increased the efficiency of the process. Fractions containing
RPI were identified by SDS-PAGE analysis and pooled.

[0193] Cleavage of the RPI enterokinase recognition site was accomplished using purified enterokinase from Sigma.
Enterokinase, 0.004 Unit/µL per reaction, was applied to the pooled and, if necessary, concentrated protein solution.
The digestion reaction was incubated at room temperature (up to 30°C in a rolling hybridization oven) for a minimum of
16 hrs and in some cases up to 48 hrs. The digestion efficiency was followed using 16.5% Tris-Tricine SDS-PAGE
peptide gels. All gel staining utilized Simply Blue Coomassie Staining Solutions. Free Proinsulin was observed
on gels after digestion.

[0194] A subsequent gel filtration separation was employed to obtain purified Proinsulin, and to remove the remaining
Ovalbumin portion of the RPI and residual native EW proteins. Select steps in the purification process were analyzed
using the 2-dimensional Beckman Coulter ProteomeLab PF2D Protein Fractionation System.
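
The enterokinase dose of 0.004 Unit/µL given in the digestion step scales linearly with reaction volume. The following sketch shows that calculation. The 500 µL reaction volume is a hypothetical value chosen for illustration, and the function name `enterokinase_units` is not taken from the procedure.

```python
def enterokinase_units(reaction_volume_uL: float, units_per_uL: float = 0.004) -> float:
    """Return the total enterokinase units needed for a reaction volume.

    The default dose of 0.004 Unit/uL is the figure given in the text.
    """
    if reaction_volume_uL < 0:
        raise ValueError("reaction volume must not be negative")
    return reaction_volume_uL * units_per_uL


# Hypothetical 500 uL digestion reaction
units = enterokinase_units(500.0)
print(f"Enterokinase required: {units:.2f} Units")
```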


EXAMPLE 9 

Optimization of Intra-oviduct and Intra-ovarian Arterial Injections 

[0195] Overall transfection rates of oviduct cells in a flock of chicken or quail hens are enhanced by synchronizing the
development of the oviduct and ovary within the flock. When the development of the oviducts and ovaries is uniform
across a group of hens and when the stage of oviduct and ovarian development can be determined or predicted, timing
of injections is optimized to transfect the greatest number of cells. Accordingly, oviduct development is synchronized as
described below to ensure that a large and uniform proportion of oviduct secretory cells are transfected with the gene
of interest.

[0196] Hens are treated with estradiol to stimulate oviduct maturation as described in Oka and Schimke (T. Oka and
RT Schimke, J. Cell Biol., 41, 816 (1969)) and Palmiter, Christensen and Schimke (J Biol. Chem. 245(4):833-845, 1970).
Specifically, repeated daily injections of 1 mg estradiol benzoate are performed sometime before the onset of sexual
maturation, a period ranging from 1-14 weeks of age. After a stimulation period sufficient to maximize development of
the oviduct, hormone treatment is withdrawn, thereby causing regression in oviduct secretory cell size but not cell number.
At an optimum time after hormone withdrawal, the lumens of the oviducts of treated hens are injected with the
transposon-based vector. Hens are subjected to additional estrogen stimulation after an optimized time during which the
transposon-based vector is taken up into oviduct secretory cells. Re-stimulation by estrogen activates transposon expression, causing
the integration of the gene of interest into the host genome. Estrogen stimulation is then withdrawn and hens continue
normal sexual development. If a developmentally regulated promoter such as the ovalbumin promoter is used, expression
of the transposon-based vector initiates in the oviduct at the time of sexual maturation. Intra-ovarian artery injection
during this window allows for high and uniform transfection efficiencies of ovarian follicles to produce germ-line
transfections and possibly oviduct expression.

[0197] Other means are also used to synchronize the development, or regression, of the oviduct and ovary to allow
high and uniform transfection efficiencies. Alterations of lighting and/or feed regimens, for example, cause hens to
'molt', during which time the oviduct and ovary regress. Molting is used to synchronize hens for transfection, and may
be used in conjunction with other hormonal methods to control regression and/or development of the oviduct and ovary.

EXAMPLE 10 

Preparation of Transposon-Based Vector pTnMod(Oval/ENT TAG/ProIns/PA) - Quail

[0198] A vector is designed for inserting a proinsulin gene under the control of a quail ovalbumin promoter, and an
ovalbumin gene including an ovalbumin signal sequence, into the genome of a bird. The vector sequence is given below
as SEQ ID NO:44.

Base pairs 1 - 130 are a remainder of F1(-) ori of pBluescriptII sk(-) (Stratagene) corresponding to base pairs 1-130
of pBluescriptII sk(-).
Base pairs 133 - 1777 are a CMV promoter/enhancer taken from vector pGWiz (Gene Therapy Systems)
corresponding to base pairs 229-1873 of pGWiz.
Base pairs 1780 - 2987 are a transposase, modified from Tn10 (GenBank accession number J01829).
Base pairs 2988 - 2993 are an engineered stop codon.
Base pairs 2995 - 3410 are a synthetic polyA from pGWiz (Gene Therapy Systems) corresponding to base pairs
1922-2337 of pGWiz.
Base pairs 3415 - 3718 are non-coding DNA that is residual from vector pNK2859.
Base pairs 3719 - 3761 are λ DNA that is residual from pNK2859.
Base pairs 3762 - 3831 are the 70 base pairs of the left insertion sequence (IS10) recognized by the transposon Tn10.
Base pairs 3838 - 4044 are a multiple cloning site from pBluescriptII sk(-) corresponding to base pairs 924-718 of
pBluescriptII sk(-).
Base pairs 4050 - 4938 are the Japanese quail ovalbumin promoter (including SDRE, steroid-dependent response
element). The Japanese quail ovalbumin promoter was isolated by its high degree of homology to the chicken
ovalbumin promoter (GenBank accession number J00895 M24999, base pairs 431-1332). Some deletions were
noted in the quail sequence, as compared to the chicken sequence.
Base pairs 4945 - 6092 are a quail ovalbumin signal sequence and ovalbumin gene that corresponds to base pairs
54 - 1201 of GenBank accession number X53964.1 (the STOP codon being omitted).
Base pairs 6093 - 6246 are a TAG sequence containing a gp41 hairpin loop from HIV I, an enterokinase cleavage
site and a spacer (synthetic).
Base pairs 6247 - 6507 are a proinsulin gene.
Base pairs 6514 - 6866 are a synthetic polyadenylation sequence from pGWiz (Gene Therapy Systems)
corresponding to base pairs 1920 - 2272 of pGWiz.
Base pairs 6867 - 7303 are a multiple cloning site from pBluescriptII sk(-) corresponding to base pairs 667-235 of
pBluescriptII sk(-).
Base pairs 7304 - 7379 are the 70 base pairs of the right insertion sequence (IS10) recognized by the transposon Tn10.
Base pairs 7380 - 7421 are λ DNA that is residual from pNK2859.
Base pairs 7422 - 8286 are non-coding DNA that is residual from pNK2859.
Base pairs 8287 - 10487 are pBluescript sk(-) base vector (Stratagene, Inc.) corresponding to base pairs 761-2961
of pBluescriptII sk(-).

[0199] It should be noted that all non-coding DNA sequences described above can be replaced with any other
non-coding DNA sequence(s). Missing nucleotide sequences in the above construct represent restriction site remnants.
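
The unannotated positions between listed features, described above as restriction site remnants, can be located mechanically from the coordinates. The following is a minimal sketch using the first six features of the listing for SEQ ID NO:44, assuming 1-based inclusive coordinates. The names `FEATURES`, `feature_length` and `find_gaps` are illustrative and do not come from the specification.

```python
from typing import List, Tuple

Feature = Tuple[int, int, str]  # (start, end, label), 1-based inclusive

# First six features of the vector as listed above
FEATURES: List[Feature] = [
    (1, 130, "F1(-) ori remainder"),
    (133, 1777, "CMV promoter/enhancer"),
    (1780, 2987, "transposase"),
    (2988, 2993, "engineered stop codon"),
    (2995, 3410, "synthetic polyA"),
    (3415, 3718, "non-coding DNA from pNK2859"),
]


def feature_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive coordinate range."""
    return end - start + 1


def find_gaps(features: List[Feature]) -> List[Tuple[int, int]]:
    """Return the unannotated ranges lying between consecutive features."""
    ordered = sorted(features)
    gaps = []
    for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


for gap_start, gap_end in find_gaps(FEATURES):
    print(f"unannotated: {gap_start}-{gap_end} ({feature_length(gap_start, gap_end)} bp)")
```

Extending `FEATURES` with the remaining entries of a listing gives the full set of remnant positions for that construct.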

EXAMPLE 11 

Preparation of Transposon-Based Vector pTnMod(Oval/ENT TAG/p146/PA) - Chicken

[0200] A vector was designed for inserting a p146 gene under the control of a chicken ovalbumin promoter, and an
ovalbumin gene including an ovalbumin signal sequence, into the genome of a bird. The vector sequence is provided
below as SEQ ID NO:45.

Base pairs 1 - 130 are a remainder of F1(-) ori of pBluescriptII sk(-) (Stratagene) corresponding to base pairs 1-130
of pBluescriptII sk(-).
Base pairs 133 - 1777 are a CMV promoter/enhancer taken from vector pGWiz (Gene Therapy Systems)
corresponding to base pairs 229-1873 of pGWiz.
Base pairs 1780 - 2987 are a transposase, modified from Tn10 (GenBank accession number J01829).
Base pairs 2988 - 2993 are an engineered stop codon.
Base pairs 2995 - 3410 are a synthetic polyA from pGWiz (Gene Therapy Systems) corresponding to base pairs
1922-2337 of pGWiz.
Base pairs 3415 - 3718 are non-coding DNA that is residual from vector pNK2859.
Base pairs 3719 - 3761 are λ DNA that is residual from pNK2859.
Base pairs 3762 - 3831 are the 70 base pairs of the left insertion sequence (IS10) recognized by the transposon Tn10.
Base pairs 3838 - 4044 are a multiple cloning site from pBluescriptII sk(-) corresponding to base pairs 924-718 of
pBluescriptII sk(-).
Base pairs 4050 - 4951 are a chicken ovalbumin promoter (including SDRE, steroid-dependent response element)
that corresponds to base pairs 431-1332 of the chicken ovalbumin promoter in GenBank Accession Number J00895
M24999.
Base pairs 4958 - 6115 are a chicken ovalbumin signal sequence and ovalbumin gene that correspond to base
pairs 66-1223 of GenBank Accession Number V00383.1 (the STOP codon being omitted).
Base pairs 6122 - 6271 are a TAG sequence containing a gp41 hairpin loop from HIV I, an enterokinase cleavage
site and a spacer (synthetic).
Base pairs 6272 - 6316 are a p146 sequence (synthetic) with 2 added stop codons.
Base pairs 6324 - 6676 are a synthetic polyadenylation sequence from pGWiz (Gene Therapy Systems)
corresponding to base pairs 1920 - 2272 of pGWiz.
Base pairs 6682 - 7114 are a multiple cloning site from pBluescriptII sk(-) corresponding to base pairs 667-235 of
pBluescriptII sk(-).
Base pairs 7120 - 7189 are the 70 base pairs of the right insertion sequence (IS10) recognized by the transposon Tn10.
Base pairs 7190 - 7231 are λ DNA that is residual from pNK2859.
Base pairs 7232 - 8096 are non-coding DNA that is residual from pNK2859.
Base pairs 8097 - 10297 are pBluescript sk(-) base vector (Stratagene, Inc.) corresponding to base pairs 761-2961
of pBluescriptII sk(-).

[0201] It should be noted that all non-coding DNA sequences described above can be replaced with any other
non-coding DNA sequence(s). Missing nucleotide sequences in the above construct represent restriction site remnants.

EXAMPLE 12

Preparation of Transposon-Based Vector pTnMod(Oval/ENT TAG/p146/PA) - Quail

[0202] A vector was designed for inserting a p146 gene under the control of a quail ovalbumin promoter, and an
ovalbumin gene including an ovalbumin signal sequence, into the genome of a bird. The vector sequence is given below
as SEQ ID NO:46.

Base pairs 1 - 130 are a remainder of F1(-) ori of pBluescriptII sk(-) (Stratagene) corresponding to base pairs 1-130
of pBluescriptII sk(-).
Base pairs 133 - 1777 are a CMV promoter/enhancer taken from vector pGWiz (Gene Therapy Systems)
corresponding to base pairs 229-1873 of pGWiz.
Base pairs 1780 - 2987 are a transposase, modified from Tn10 (GenBank accession number J01829).
Base pairs 2988 - 2993 are an engineered stop codon.
Base pairs 2995 - 3410 are a synthetic polyA from pGWiz (Gene Therapy Systems) corresponding to base pairs
1922-2337 of pGWiz.
Base pairs 3415 - 3718 are non-coding DNA that is residual from vector pNK2859.
Base pairs 3719 - 3761 are λ DNA that is residual from pNK2859.
Base pairs 3762 - 3831 are the 70 base pairs of the left insertion sequence (IS10) recognized by the transposon Tn10.
Base pairs 3838 - 4044 are a multiple cloning site from pBluescriptII sk(-) corresponding to base pairs 924-718 of
pBluescriptII sk(-).
Base pairs 4050 - 4938 are the Japanese quail ovalbumin promoter (including SDRE, steroid-dependent response
element). The Japanese quail ovalbumin promoter was isolated by its high degree of homology to the chicken
ovalbumin promoter (GenBank accession number J00895 M24999, base pairs 431-1332).
Base pairs 4945 - 6092 are a quail ovalbumin signal sequence and ovalbumin gene that corresponds to base pairs
54 - 1201 of GenBank accession number X53964.1 (the STOP codon being omitted).
Base pairs 6097 - 6246 are a TAG sequence containing a gp41 hairpin loop from HIV I, an enterokinase cleavage
site and a spacer (synthetic).
Base pairs 6247 - 6291 are a p146 sequence (synthetic) with 2 added stop codons.
Base pairs 6299 - 6651 are a synthetic polyadenylation sequence from pGWiz (Gene Therapy Systems)
corresponding to base pairs 1920 - 2272 of pGWiz.
Base pairs 6657 - 7089 are a multiple cloning site from pBluescriptII sk(-) corresponding to base pairs 667-235 of
pBluescriptII sk(-).
Base pairs 7095 - 7164 are the 70 base pairs of the right insertion sequence (IS10) recognized by the transposon Tn10.
Base pairs 7165 - 7206 are λ DNA that is residual from pNK2859.
Base pairs 7207 - 8071 are non-coding DNA that is residual from pNK2859.
Base pairs 8072 - 10272 are pBluescript sk(-) base vector (Stratagene, Inc.) corresponding to base pairs 761-2961 of
pBluescriptII sk(-).

[0203] It should be noted that all non-coding DNA sequences described above can be replaced with any other
non-coding DNA sequence(s). Missing nucleotide sequences in the above construct represent restriction site remnants.

EXAMPLE 13

Additional Transposon-Based Vectors for Administration to an Animal

[0204] The following example provides a description of various transposon-based vectors of the present invention
and several constructs that have been made for insertion into the transposon-based vectors of the present invention.
These examples are not meant to be limiting in any way. The constructs for insertion into a transposon-based vector
are provided in a cloning vector pTnMCS or pTnMod, both described above.

pTnMCS (CMV-CHOVg-ent-ProInsulin-synPA) (SEQ ID NO: 47)

Bp 1 - 3670 from vector pTnMCS, bp 1 - 3670
Bp 3676 - 5320 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy Systems), bp 230-1864
Bp 5327 - 6480 Chicken ovalbumin gene taken from GenBank accession # V00383, bp 66-1219
Bp 6487 - 6636 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 6637 - 6897 Human Proinsulin taken from GenBank accession # NM000207, bp 117-377
Bp 6898 - 6942 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
pGWIZ (Gene Therapy Systems)
Bp 6943 - 7295 Synthetic polyA from the cloning vector pGWIZ (Gene Therapy Systems), bp 1920-2271
Bp 7296 - 10895 from cloning vector pTnMCS, bp 3716-7315
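
Each entry in a listing of this kind pairs a coordinate range in the construct with a range in the source sequence, so the two lengths can be compared as a consistency check. The sketch below does this for the entries of the construct listed above that cite a source range. The names `Segment`, `span` and `length_mismatches` are illustrative. A reported difference does not by itself indicate which number is wrong, and it may also reflect cloning remnants.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    label: str
    vector_start: int
    vector_end: int
    source_start: int
    source_end: int


def span(start: int, end: int) -> int:
    """Length of a 1-based inclusive range; also works for reversed ranges."""
    return abs(end - start) + 1


# Entries transcribed from the listing above (those that cite a source range)
SEGMENTS: List[Segment] = [
    Segment("pTnMCS backbone (5')", 1, 3670, 1, 3670),
    Segment("CMV promoter/enhancer", 3676, 5320, 230, 1864),
    Segment("chicken ovalbumin gene", 5327, 6480, 66, 1219),
    Segment("human proinsulin", 6637, 6897, 117, 377),
    Segment("synthetic polyA", 6943, 7295, 1920, 2271),
    Segment("pTnMCS backbone (3')", 7296, 10895, 3716, 7315),
]


def length_mismatches(segments: List[Segment]) -> List[Tuple[str, int]]:
    """Return (label, difference in bp) where construct and source lengths differ."""
    result = []
    for seg in segments:
        diff = span(seg.vector_start, seg.vector_end) - span(seg.source_start, seg.source_end)
        if diff != 0:
            result.append((seg.label, diff))
    return result


for label, diff in length_mismatches(SEGMENTS):
    print(f"{label}: construct range differs from source range by {diff} bp")
```

Entries flagged by such a check are candidates for comparison against the deposited sequence listing.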

pTnMCS (CMV-prepro-ent-ProInsulin-synPA)

Bp 1 - 3670 from vector pTnMCS, bp 1 - 3670
Bp 3676 - 5320 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy Systems), bp 230-1864
Bp 5326 - 5496 Capsite/prepro taken from GenBank accession # X07404, bp 563-733
Bp 5504 - 5652 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 5653 - 5913 Human Proinsulin taken from GenBank accession # NM000207, bp 117-377
Bp 5914 - 5958 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
pGWIZ (Gene Therapy Systems)
Bp 5959 - 6310 Synthetic polyA from the cloning vector pGWIZ (Gene Therapy Systems), bp 1920-2271
Bp 6313 - 9912 from cloning vector pTnMCS, bp 3716-7315

pTnMCS(Chicken OVep+OVg'+ENT+proins+syn polyA)

Bp 1 - 3670 from vector pTnMCS, bp 1 - 3670
Bp 3676 - 4350 Chicken Ovalbumin enhancer taken from GenBank accession # S82527.1 bp 1-675
Bp 4357 - 5692 Chicken Ovalbumin promoter taken from GenBank accession # J00895-M24999 bp 1-1336
Bp 5699 - 6917 Chicken Ovalbumin gene from GenBank Accession # V00383.1 bp 2-1220. (This sequence
includes the 5'UTR, containing putative cap site, bp 5699-5762.)
Bp 6924 - 7073 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 7074 - 7334 Human proinsulin GenBank Accession # NM000207 bp 117-377
Bp 7335 - 7379 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
gWIZ (Gene Therapy Systems)
Bp 7380 - 7731 Synthetic polyA from the cloning vector gWIZ (Gene Therapy Systems) bp 1920 - 2271
Bp 7733 - 11332 from vector pTnMCS, bp 3716-7315

pTnMCS(Chicken OVep+prepro+ENT+proins+syn polyA)

Bp 1 - 3670 from cloning vector pTnMCS, bp 1 - 3670
Bp 3676 - 4350 Chicken Ovalbumin enhancer taken from GenBank accession # S82527.1 bp 1-675
Bp 4357 - 5692 Chicken Ovalbumin promoter taken from GenBank accession # J00895-M24999 bp 1-1336
Bp 5699 - 5869 Cecropin cap site and prepro, Genbank accession # X07404 bp 563-733
Bp 5876 - 6025 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage
site
Bp 6026 - 6286 Human proinsulin GenBank Accession # NM000207 bp 117-377
Bp 6287 - 6331 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
gWIZ (Gene Therapy Systems)
Bp 6332 - 6683 Synthetic polyA from the cloning vector gWIZ (Gene Therapy Systems) bp 1920-2271
Bp 6685 - 10284 from cloning vector pTnMCS, bp 3716 - 7315

pTnMCS(Quail OVep+OVg'+ENT+proins+syn polyA)

Bp 1 - 3670 from cloning vector pTnMCS, bp 1 - 3670
Bp 3676 - 4333 Quail Ovalbumin enhancer: 658 bp sequence, amplified in-house from quail genomic DNA,
roughly equivalent to the far-upstream chicken ovalbumin enhancer, GenBank accession # S82527.1, bp 1-675.
(There are multiple base pair substitutions and deletions in the quail sequence, relative to chicken, so the number
of bases does not correspond exactly.)
Bp 4340 - 5705 Quail Ovalbumin promoter: 1366 bp sequence, amplified in-house from quail genomic DNA,
roughly corresponding to chicken ovalbumin promoter, GenBank accession # J00895-M24999 bp 1-1336. (There
are multiple base pair substitutions and deletions between the quail and chicken sequences, so the number of
bases does not correspond exactly.)
Bp 5712 - 6910 Quail Ovalbumin gene, EMBL accession # X53964, bp 1-1199. (This sequence includes the
5'UTR, containing putative cap site bp 5712-5764.)
Bp 6917 - 7066 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 7067 - 7327 Human proinsulin GenBank Accession # NM000207 bp 117-377
Bp 7328 - 7372 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
gWIZ (Gene Therapy Systems)
Bp 7373 - 7724 Synthetic polyA from the cloning vector gWIZ (Gene Therapy Systems) bp 1920-2271
Bp 7726 - 11325 from cloning vector pTnMCS, bp 3716 - 7315

pTnMCS(Quail OVep+prepro+ENT+proins+syn polyA)

Bp 1 - 3670 from cloning vector pTnMCS, bp 1 - 3670
Bp 3676 - 4333 Quail Ovalbumin enhancer: 658 bp sequence, amplified from quail genomic DNA, roughly
equivalent to the far-upstream chicken ovalbumin enhancer, GenBank accession # S82527.1, bp 1-675. (There
are multiple base pair substitutions and deletions in the quail sequence, relative to chicken, so the number of
bases does not correspond exactly.)
Bp 4340 - 5705 Quail Ovalbumin promoter: 1366 bp sequence, amplified from quail genomic DNA, roughly
corresponding to chicken ovalbumin promoter, GenBank accession # J00895-M24999 bp 1-1336. (There are
multiple base pair substitutions and deletions between the quail and chicken sequences, so the number of
bases does not correspond exactly.)
Bp 5712 - 5882 Cecropin cap site and prepro, Genbank accession # X07404 bp 563-733
Bp 5889 - 6038 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 6039 - 6299 Human proinsulin GenBank Accession # NM000207 bp 117-377
Bp 6300 - 6344 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
gWIZ (Gene Therapy Systems)
Bp 6345 - 6696 Synthetic polyA from the cloning vector gWIZ (Gene Therapy Systems) bp 1920 - 2271
Bp 6698 - 10297 from cloning vector pTnMCS, bp 3716 - 7315.

pTnMOD (CMV-prepro-ent-proins-synPA)

Bp 1 - 4045 from vector pTnMCS, bp 1-4045
Bp 4051 - 5695 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy Systems), bp 230-1864
Bp 5701 - 5871 Capsite/prepro taken from GenBank accession # X07404, bp 563-733
Bp 5879 - 6027 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 6028 - 6288 Human Proinsulin taken from GenBank accession # NM000207, bp 117-377
Bp 6289 - 6333 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
pGWIZ (Gene Therapy Systems)
Bp 6334 - 6685 Synthetic polyA from the cloning vector pGWIZ (Gene Therapy Systems), bp 1920-2271
Bp 6687 - 10286 from cloning vector pTnMCS, bp 3716-7315

pTnMOD(Chicken OVep+OVg'+ENT+proins+syn polyA)

Bp 1 - 4045 from cloning vector pTnMod, bp 1 - 4045
Bp 4051 - 4725 Chicken Ovalbumin enhancer taken from GenBank accession # S82527.1 bp 1-675
Bp 4732 - 6067 Chicken Ovalbumin promoter taken from GenBank accession # J00895-M24999 bp 1-1336
Bp 6074 - 7292 Chicken Ovalbumin gene from GenBank Accession # V00383.1 bp 2-1220. (This sequence
includes the 5'UTR, containing putative cap site bp 6074-6137.)
Bp 7299 - 7448 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 7449 - 7709 Human proinsulin GenBank Accession # NM000207 bp 117-377
Bp 7710 - 7754 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
gWIZ (Gene Therapy Systems)
Bp 7755 - 8106 Synthetic polyA from the cloning vector gWIZ (Gene Therapy Systems) bp 1920-2271
Bp 8108 - 11707 from cloning vector pTnMod, bp 3716 - 7315

pTnMOD(Chicken OVep+prepro+ENT+proins+syn polyA)

Bp 1 - 4045 from cloning vector pTnMCS, bp 1 - 4045
Bp 4051 - 4725 Chicken Ovalbumin enhancer taken from GenBank accession # S82527.1 bp 1-675
Bp 4732 - 6067 Chicken Ovalbumin promoter taken from GenBank accession # J00895-M24999 bp 1-1336
Bp 6074 - 6244 Cecropin cap site and prepro, Genbank accession # X07404 bp 563-733
Bp 6251 - 6400 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 6401 - 6661 Human proinsulin GenBank Accession # NM000207 bp 117-377
Bp 6662 - 6706 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
gWIZ (Gene Therapy Systems)
Bp 6707 - 7058 Synthetic polyA from the cloning vector gWIZ (Gene Therapy Systems) bp 1920 - 2271
Bp 7060 - 10659 from cloning vector pTnMCS, bp 3716 - 7315

pTnMOD(Quail OVep+OVg'+ENT+proins+syn polyA)

Bp 1 - 4045 from cloning vector pTnMCS, bp 1 - 4045
Bp 4051 - 4708 Quail Ovalbumin enhancer: 658 bp sequence, amplified in-house from quail genomic DNA,
roughly equivalent to the far-upstream chicken ovalbumin enhancer, GenBank accession # S82527.1, bp 1-675.
(There are multiple base pair substitutions and deletions in the quail sequence, relative to chicken, so the number
of bases does not correspond exactly.)
Bp 4715 - 6080 Quail Ovalbumin promoter: 1366 bp sequence, amplified in-house from quail genomic DNA,
roughly corresponding to chicken ovalbumin promoter, GenBank accession # J00895-M24999 bp 1-1336. (There
are multiple base pair substitutions and deletions between the quail and chicken sequences, so the number of
bases does not correspond exactly.)
Bp 6087 - 7285 Quail Ovalbumin gene, EMBL accession # X53964, bp 1-1199. (This sequence includes the
5'UTR, containing putative cap site bp 6087-6139.)
Bp 7292 - 7441 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 7442 - 7702 Human proinsulin GenBank Accession # NM000207 bp 117-377
Bp 7703 - 7747 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
gWIZ (Gene Therapy Systems)
Bp 7748 - 8099 Synthetic polyA from the cloning vector gWIZ (Gene Therapy Systems) bp 1920-2271
Bp 8101 - 11700 from cloning vector pTnMCS, bp 3716-7315

pTnMOD(Ouail OVep+prepro+ENT+proins+syn polyA) 

Bp 1-4045 from cloning vector pTnMCS, bp 1 - 4045 

Bp 4051 - 4708 Quail Ovalbumin enhancer. 658 bp sequence, amplified in-housetrom quail genomic DNA, 
roughly equivalent to the far-upstream chicken ovalbumin enhancer, GenBank accession #382527.1, bp 1-675. 
(There are multiple base pair substitutions and deletions in the quail sequence, relative to chicken, so the number 
of bases does not correspond exactly.) 

Bp 4715 - 6080 Quail Ovalbumin promoter: 1366 bp sequence, amplified in-house from quail genomic DNA, 
roughly corresponding to chicken ovalbumin promoter, GenBank accession #J00895-M24999 bp M 336. (There 
are multiple base pair substitutions and deletions between the quail and chicken sequences, so the number of 
bases does not correspond exactly.) 

Bp 6087-6257 Cecropin cap site and Prepro, GenBank accession # X07404 bp 563-733

Bp 6264 - 6413 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site

Bp 6414 - 6674 Human proinsulin GenBank Accession # NM000207 bp 117-377

Bp 6675 - 6719 Spacer DNA, derived as an artifact from the cloning vectors pTOPO Blunt II (Invitrogen) and
gWIZ (Gene Therapy Systems)

Bp 6720 - 7071 Synthetic polyA from the cloning vector gWIZ (Gene Therapy Systems) bp 1920-2271 
Bp 7073 - 10672 from cloning vector pTnMCS, bp 3716 - 7315

pTnMOD (CMV-prepro-ent-hGH-CPA) 

Bp 1-4045 from vector pTnMOD, bp 1-4045

Bp 4051-5694 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy Systems), bp 230-1873
Bp 5701-5871 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733
Bp 5878-6012 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 6013-6666 Human growth hormone taken from GenBank accession # V00519, bp 1-654
Bp 6673-7080 Conalbumin polyA taken from GenBank accession # Y00407, bp 10651-11058
Bp 7082-10681 from cloning vector pTnMOD, bp 4091-7690 
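
As a cross-check on coordinate tables of this kind, each inserted segment should span the same number of base pairs in the construct as in the record it was taken from. The following is a minimal Python sketch under that assumption, using the pTnMOD (CMV-prepro-ent-hGH-CPA) coordinates listed above. The names FEATURES, span and find_length_mismatches are illustrative only and do not appear in the specification; the synthetic spacer row is omitted because no source coordinates are given for it.

```python
# Illustrative consistency check for a vector feature table (not part of the specification).
# Each row: (feature name, construct start, construct end, source start, source end),
# with coordinates copied from the pTnMOD (CMV-prepro-ent-hGH-CPA) table.
FEATURES = [
    ("CMV promoter/enhancer (pGWIZ)", 4051, 5694, 230, 1873),
    ("Capsite/Prepro (X07404)", 5701, 5871, 563, 733),
    ("Human growth hormone (V00519)", 6013, 6666, 1, 654),
    ("Conalbumin polyA (Y00407)", 6673, 7080, 10651, 11058),
    ("pTnMOD backbone", 7082, 10681, 4091, 7690),
]


def span(start, end):
    """Length in bp of an inclusive, 1-based coordinate range."""
    return end - start + 1


def find_length_mismatches(features):
    """Return (name, construct_len, source_len) for rows whose two ranges differ in length."""
    mismatches = []
    for name, c_start, c_end, s_start, s_end in features:
        construct_len = span(c_start, c_end)
        source_len = span(s_start, s_end)
        if construct_len != source_len:
            mismatches.append((name, construct_len, source_len))
    return mismatches


if __name__ == "__main__":
    for name, c_start, c_end, _, _ in FEATURES:
        print(f"{name}: {span(c_start, c_end)} bp")
    print("mismatches:", find_length_mismatches(FEATURES))
```

A row whose two ranges differ in length would point to a transcription error in one of the coordinates rather than to a property of the construct itself.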

pTnMCS (CHOVep-prepro-ent-hGH-CPA) 






Bp 1-3670 from vector pTnMCS, bp 1-3670

Bp 3676-4350 Chicken Ovalbumin enhancer taken from GenBank accession # S82527.1, bp 1-675

Bp 4357-5692 Chicken Ovalbumin promoter taken from GenBank accession # J00899-M24999, bp 1-1336 

Bp 5699-5869 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733

Bp 5876-6010 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site

Bp 6011-6664 Human growth hormone taken from GenBank accession # V00519, bp 1-654

Bp 6671-7078 Conalbumin polyA taken from GenBank accession # Y00407, bp 10651-11058

Bp 7080-10679 from cloning vector pTnMCS, bp 3716-7315 

pTnMCS (CMV-prepro-ent-hGH-CPA) 

Bp 1-3670 from vector pTnMCS, bp 1-3670

Bp 3676-5319 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy Systems), bp 230-1873
Bp 5326-5496 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733
Bp 5503-5637 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site
Bp 5638-6291 Human growth hormone taken from GenBank accession # V00519, bp 1-654
Bp 6298-6705 Conalbumin polyA taken from GenBank accession # Y00407, bp 10651-11058
Bp 6707-10306 from cloning vector pTnMCS, bp 3716-7315 

pTnMOD (CHOVep-prepro-ent-hGH-CPA) 

Bp 1-4045 from vector pTnMOD, bp 1-4045

Bp 4051-4725 Chicken Ovalbumin enhancer taken from GenBank accession # S82527.1, bp 1-675 

Bp 4732-6067 Chicken Ovalbumin promoter taken from GenBank accession # J00899-M24999, bp 1-1336

Bp 6074-6244 Capsite/Prepro taken from GenBank accession # X07404, bp 563-733

Bp 6251-6385 Synthetic spacer sequence and hairpin loop of HIV gp41 with an added enterokinase cleavage site

Bp 6386-7039 Human growth hormone taken from GenBank accession # V00519, bp 1-654

Bp 7046-7453 Conalbumin polyA taken from GenBank accession # Y00407, bp 10651-11058

Bp 7455-11054 from cloning vector pTnMOD, bp 4091-7690

pTnMod (CMV/Transposase/ChickOvep/prepro/ProteinA/ConpolyA)

BP 1-130 remainder of F1 (-) ori of pBluescriptII sk(-) (Stratagene) bp 1-130.

BP 133-1777 CMV promoter/enhancer taken from vector pGWIZ (Gene Therapy Systems) bp 229-1873.

BP 1780-2987 Transposase, modified from Tn10 (GenBank # J01829).

BP 2988-2993 Engineered DOUBLE stop codon.

BP 2994-3343 non coding DNA from vector pNK2859.

BP 3344-3386 Lambda DNA from pNK2859.

BP 3387-3456 70 bp of IS10 left from Tn10.

BP 3457-3674 multiple cloning site from pBluescriptII sk(-) bp 924-707.

BP 3675-5691 Chicken Ovalbumin enhancer plus promoter from aTopo Clone 10 maxi 040303 (5' XmaI, 3' BamHI)

BP 5698-5865 prepro with Cap site amplified from cecropin of pMON200 GenBank # X07404 (5' BamHI, 3' KpnI)

BP 5872-7338 Protein A gene from GenBank # J01786, mature peptide bp 292-1755 (5' KpnI, 3' SacII)

BP 7345-7752 ConPolyA from chicken conalbumin polyA from GenBank # Y00407 bp 10651-11058. (5' SacII, 3' XhoI)

BP 7753-8195 multiple cloning site from pBluescriptII sk(-) bp 677-235.
BP 8196-8265 70 bp of IS10 left from Tn10.

BP 8266-8307 Lambda DNA from pNK2859
BP 8308-9151 noncoding DNA from pNK2859

BP 9152-11352 pBluescriptII sk(-) base vector (Stratagene, Inc.) bp 761-2961
Appendix A 
[0205] 

SEQ ID NO:1 (modified Kozak sequence)
ACCATG 

SEQ ID NO:2 (pTnMCS)

1 ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg cgcagegtga 
61 cogctacact tgccagcgcc ctagcgcecg ctcctttcgc tttcttccct tcctttctcg 
121 ccacgttcgc cggcateaga ttggctattg gccattgcat acgttgtatc oatatcataa 
181 tatgtacatt tatattggct cacgtccaac attaexgcca cgttgacate gattatcgac 
241 tagttattaa tagtaaccaa ttacggggtc attagttcat agcccatata tggagttccg 
301 cgttacataa cttacggtaa atggcccgcc tggccgaccg cccaacgacc cccgcccatc 
361 gacgtcaata atgacgtatg ttcceatagt aacgecaata gggactttcc attgacgtca 
421 acgggtggag tatttacggc aaactgccca cttggcagta catcaagtgt atcatatgcc 
481 aagtacgccc cctattgaeg tcaatgacgg taaatggccc gcctggcatt atgcccagta 
541 catgaectta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattae 
601 catggtgatg cggttttggc agtacatcaa cgggcgtgga tagcggtttg actcacgggg 
661 atttccaagt ctccacccca ttgacgteaa. tgggagtttg ttttggcacc aaaatcaacg 
721 ggaetttcca aaatgtcgta acaacteogc cecacegacg caaatgggcg gtaggcgtgt 
781 aeggtgggag gtctatataa gcagagctcg tctagtgaac cgtcagatcg cctggagacg 
841 ecatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc tccgcggccg 
901 ggaacggtgc attggaacgc ggattccccg tgccaagagt gacgtaagta ccgcctatag 
961 actctatagg cacacccctt tggctcttat gcatgctata ctgtttttgg cttggggcct 
1021 atacaccccc gcttccttat gctataggtg atggtatagc ttagcctata ggtgcgggtt 
1081 attgaccatt attgaccact cccctattgg tgacgatact ttccattact aatccataac 
1141 atggctcttt gccacaacta tctetaetgg cCatatgcca atactctgtc cttcagagac 
1201 tgacacggac tctgtatttt tacaggacgg ggtcccattt attatttaca aattcacata 
1261 tacaacaacg ccgteccccg tgceegi=agt ttttactaaa catagcgtgg gatctccacg 
1321 cgaatcccgg gtacgtgttc cggacacggg ctcttctccg gtagcggcgg agcttccaca 
1381 tccgagccct ggtcccatgc ctccagegge tcatggtcgc tcggcagctc cttgctcct? 
1441 acagtggagg ccagacttag gcacagcaca atgcceacca ccaccagtgt gccgcacaag 
1501 gccgtggcgg tagggtatgt gtctgaaaac gagcgtggag attgggctcg cacggctgac
1561 gcagatggaa gacttaaggc agcggcagaa gaagatgcag gcagctgagt tgttgtattc 
1621 tgataagagt eagaggcaac tcccgttgcg gcgctgttaa cggtggaggg cagtgtagte 
1681 tgagcagtac tcgttgctgc cgcgegcgcc accagacaca atagctgaca gactaacaga 
1741 ctgttccttt ccatgggtcc tttctgeagt cacogtcgga ecatgtgcga actegat^tt 
1801 tcacacgact ctctttacca attctgcccc gaattacaet taaaaogact caacagctta
1861 acgttggotc gccacgcatt acttgaccgt aaaactctca ebcttaccga aettggeegt 
1921 aacctgccaa ccaaagogag aacaaaacat aacatcaaac gaatcgaecg attgttaggt 
1981 aatcgtcacc tccacaaaga gcgactcgcc gtataccgtt ggeatgctag ccttatctgt 
2041 tcgggcaata cgatgcccat tgtacttgtt gaccggtctg atattcgtga gcaaaaacga 
2101 cttatggtat tgcgagcttc agtcgcacta caeggtcgtt ctgttaetct ctatgagaaa 
2161 gcgttcccgc tttcagagea atgttcaaag aaagctcatg accaatttct agccgacctt 
2221 gcgagcatte taecgagtaa caceacaccg ctoattgtca gtgatgctgg ctttaaagtg 
2281 ccatggtata aatccgttga gaagctgggt tggtactggt taagtcgagt aagaggaaaa 
2341 gtacaatatg cagacctagg agcggaaaac tggaaaccta tcagcaactt acatgaCatg 
2401 tcatctagtc acCcaaagac tttaggctat aagaggctga ctaaaagcaa tccaatctca 
2461 tgccaaattc tattgtataa atctcgctct aaaggcegaa aaaatcageg ctcga cacg g 
2521 acteatcgtc accacccgcc acctaaaate taetcagcgt eggeaaagga geeatgggtt
2581 ceagcaacta acttacetgt tgaaattcga aeacccaaac aacCtgttaa tatctattcg 
2641 aagegaatgc agattgaaga aaccttccga gacttgaaaa gtcctgccta cggactaggc 
2701 ctacgccata gccgaacgag cagctcagag egttttgata ccacgctgct aatcgccctg 
2761 atgcttcaac taacatgttg gcttgcgggc gttcatgctc agaaaeaagg ttgggacaag 
2821 eacttccagg ccaacacagt cagaaatcga aacgtactct eaacagttcg cttaggcatg 
2881 gaagttttgc ggcattctgg ctacacaata acaagggaag acttactcgt ggctgcaacc 
2941 ceactagete aaaatttatt cacacatggt tacgctttgg ggaaattatg aggggatcgc 
3001 tetagagega tcogggatcc cgggaaaagc gttggtgacc aaaggtgcct tttatcatca 
3061 ctttaaaaaC aaaaaacaat tactcagtgc ctgttataag cagcaattaa ttatgattga 
3121 tgcctacatc acaacaaaaa etgatttaac aaatggttgg tctgccttag aaagtatatt 
3181 tgaacattat cttgattata ttattgataa taataaaaac cttatcccta tccaagaagt 
3241 gatgcctate attggttgga atgaacttga aaaaaattag ccttgaatac attactggta 
3301 aggtaaacgc cattgtcagc aaattgatec aagagaacca acttaaagct ttcctgacgg 
3361 aatgttaatc ctcgttgacc ctgagcaetg atgaaeccce taatgatttt ggtaaaaacc 
3421 attaagttaa ggtggataca catcttgtca catgateeeg gtaatgtgag ttagctcact 
3481 cattaggcac cccaggcttt acactttatg cttceggctc gtacgttgtg t^aattgcg 
3541 agcggataac aacctcacac aggaaacagc cacgaccatg attacgccaa gcgcgcaatt
3601 aaccctcacc aaagggaaca aaagccggag ctccaccgcg gtggcggccg ctctagaact 
3661 agtggatccc ccgggctgca ggaactcgat atcaagctta Ccgataccgc tgacctcgag 
3721 ggggggcccg gtacccaatt cgccctatag tgagtcgCat tacgcgcgct cactggccgt 
3781 cgttttacaa cgtcgtgacc gggaaaaccc tggcgttacc caacttaatc gccctgcagc 
3841 acatcccccc ttcgccagct ggcgcaatag cgaagaggcc cgcaccgatc gcccttccca
3901 acagtcgcgc agcctgaatg gcgaacggaa atcgcaagcg ctaatatttt gttaaaaccc 
3961 gcgttaaatt tctgttaaac cagcccattt ttcaaccaat aggccgaaat cggcaaaatc 
4021 ccttataaat caaaagaata gaccgagata gggttgagtg ttgttccagt ttggaacaag 
4081 agtccactac taaagaacgt ggactccaac gtcaaiagggc gaaaaaccgt ctaccagggc 
4141 gatggcccac tactccggga tcatatgaca agatgtgtat ccaccttaac ttaatgattt 
4201 ccaecaaaat caccagggga ctcaceagtg ctcagggcca acgagaatca acatcccgtc 
4261 aggaaagctt atgatgatga tgtgcttaaa aacttactca atggctggtt atgcatatcg 
4321 caacacatgc gaaaaaceta aaagagctcg ccgatasaaa aggccaactt atcgctattt 
4381 accgcggctt ttcattgagc ctgaaagata aataaaacag ataggtttta tttgaagcca
4441 aatctcctte aCcgtaaaaa atgccetcet gggttatcaa gagggtcatc atacttcgcg
4501 gaataacatc atttggtgac gaaataacta agcaettgte tcctgtttae tcccctgagc
4561 ctgaggggtc aacatgaagg ccatcgatag caggataaca atacagtaaa acgctaaacc 
4621 aataatccaa atccagccat cccaaattgg tagtgaatga ttataaataa cagcaaaeag 
4681 taatgggcca ataacacegg ttgcaccggc aaggetcacc aataatccct gtaaagcaec 
4741 ttgctgatga ctctttgttt ggatagacat cactccctgt aatgcaggta aagcgatcce 
4801 aecaccaseo aataaaaeta aaacagggaa aactaaccaa ccttcagata taaacgetaa 
4861 aaaggeaaat gcactaccab-etgcaataaa tecgagcagt actgeegttt tttcgeeeat 
4921 ttagtggeta ttettcctge eaeaaaggct tggaatactg agtgtaaaag aceaagacec
4981 gtaacgaaaa gccaaccatc atgctatcca tcatcacgat cectgeaata gcaceaeaee
5041 gtgetggatt ggetateaat gegetgaaat aataatcaae aaatggcatc gttaaataag 
5101 tgatgtatae cgatcagett ttgtteccte tagtgagggt caattgegeg cttggcgtaa
5161 teatggteat agctgtttcc tgtgtgaaat tgttatccge teacaattcc acacaacata
5221 egageeggaa gcataaagtg taaageetgg ggtgeetaab gagtgagcta actcacatca
5281 actgogtege gctcaeegte egetttccag tegggaaacc tgtcgtgeea gctgeattaa
5341 tgaateggee aaegogeggg gagaggeggt ttgegtattg ggegetet'te cgoctceteg 
5401 eteaeegaot egeegegetc ggtcgttogg ctgoggcgag eggtaceaga tcacteaaag 
5461 goggtaatae ggetatceac agaateaggg gataacgeag gaiuigaaeat gtgageaaaa
5521 ggeeageaaa aggeeaggaa ecgtaaaaag gcegegtege tggcgttttt ceacaggece
5581 cgccccectg acgagcacca caaaaatcga cgctcaagte agaggcggcg aaaceogaea
5641 ggactataaa gataccaggc gtttcccccc ggaagcteec ecgtgcgctc tecegtteog 
5701 aecetgeogc ttaccggaea cctgtccgcc tttctccctt cgggaagcgc ggcgctttct 
5761 catagcccac getgtaggca teteagtccg gtgtaggtog ttegctecaa gctgggctgt 
5821 gcgcacgaac eccccgttca gcccgaccgc tgegcctcat ccggtaacta tegcettgag 
5881 tccaacccgg taagaeacga cetatcgcca ctggcagcag ccacCggtaa caggattagc
5941 agagcgaggt atgtaggcgg tgctacagag ctcctgaagt ggtggeotaa ctacggctae
6001 actagaagga cagtatttgg cacctgcgcc ctgctgaage cagttacctc cggaaaaaga 
6061 gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt tttcgttcgc 
6121 aagcageaga ccacgcgcag aaaaaaagga cctcaagaag atcctttgac cttttctacg 
6181 gggtctgacg cccagtggaa cgaaaactca cgttaagggs ttttggtcac gagattatca
6241 aaaaggatct tcacetagac cctttcaaac taaaaatgaa gtttcaaatc aatccaaagt 
6301 atatatgagt aaacctggtc tgacagctac caacgcccaa ccagtgaggc acccatctea 
6361 gcgatctgtc tatttcgttc acccacagtc gcccgactcc ccgtcgtgta gataactacg 
6421 atacgggagg gctcaccacc tggccccagc gccgcaacga caccgcgaga cccacgctca 
6481 ccggetccag atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggt 
6541 ccCgcaactt tatccgcccc catccagtct attaattgtt gccgggaagc tagagcaagc 
6601 agtbcgccag ttaacagttt gcgcaacgtc gttgccattg ccacaggcat cgtggtgtca 
6661 cgctcgccgt ttggtatggc ctcattcagc tccggtcccc aacgatcaag gcgagttaca 
6721 tgacccccca tgctgtgcaa aaaagcggtc agccccttcg gtcctccgat cgtcgtcaga 
6781 agtaagctgg ccgcagtgtt atcaetcatg gttatggcag cactgeataa Ctctcttact 
6841 gtcatgccat ccgcaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga 
6901 gaatagtgta tgeggcgacc gagttgctct tgcccggcgt caatacggga taataccgcg 
6961 ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg gcgaaaactc 
7021 Ccaaggatct taccgctgct gagatccagt tcgatgtaac ccactcgtgc acccaactga 
7081 tettcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg aaggcaaaat 
7141 gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact cttccttbtt 
7201 caatattatt gaagcattta tcagggttat tgtctcatga gcggatacat atttgaatgt 
7261 atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt gccae 



SEQ ID NO:3 (pTnMod)
CTGAC6C6CC CT6TA6CGGC GCATTAAGCG CGGC6GGTGT GOTGOrTACG 50
CGCAGC6TQA CC6CTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100 
TTTCTTCCCT TCCnTCTOS CCACGTTCX3C CGGCATCAGA TTGGCTATTG 150 
GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200 
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250 
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGQA6TTCCG 300 
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACQACC 350 
CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA 400 
GGGACTTTCC ATTGACGTCA ATGGGTGGAG TATTTACGGT AAACT6CCCA 450 
CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGAOG 500 
TCAATGAC6G XAAATSGCCC GCCTG6CATT ATGCCCAGTA CATGACCTTA 550
T6G6ACTTTC CTACTTGGCA GIACATCTAC 6TATTA6TCA TCGCTATTAC 600 
CATGGXGAIG 060TTXTG6C A6TACATCAA TGGGCGIGQA TA60S6TTTS 650 
ACTCACGGG6 ATTTCCAAQT CTCCACCCCA TTQACGTCAA TGGGAGTTT3 700 
TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCQTA ACAACTCCGC 750 
CCCATTOACG CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 800 
GCA6AGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACG CCATCCAC6C 850 
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCGGCCG 900 
GGAAOGGTGC ATTGGKACGC GGATTCCCCQ TGCCAAGAGT GACGTAAGTA 950 
CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCICTtAT GCA1GCTATA 1000
CTGTTTTTG6 CTXGGGGCCT' ATACACCCCC GCTTCCTXAT GCTATA6GI6 1050 
ATGG»CATA6C TTASCCIATA GGTGTGGGTT ATTGACCArT ATTGACCACT 1100 
CCOCTAnCG TSACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150 
GCPACAACXA TCT C aiTTOG .CTATA^GCCA ATACTCTGTC CTTCAGAGAC ' 1200 
TGACACG6AC TCTGDUTl'l' .. TACAGQATGP GGTCCCATTT ATTATTTACA 1250 
AATTCACATA TACAACAACG . CCGTCCCCCG TGCCCGCAGT TTTTATTAAA 1300 
CATAGCGXGG GATCTCCACG CGAATCTCG6 GTACGTGTTC CGGACATGGG 1350 
CXCTTCTCCG GTAGCGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATGC 1400 
CTCCAGCGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGTG6AGG 1450 
CCAGACTTAO GCACAGCACA ATGCCCACCA CCACCA6TGT GCCGCACAAG 1500 
GCCGTGGCGG TAGGCTIATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550 
CACX3GCTGAC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGAT6CAQ 1600 
GCAGCX6AQT TGTTGTATTC TQATAAGAGT CAGAOGTAAC TCGGGTTG06 1650 
GTGCTGI73VA GGGTGQAGGG. CAGTGTA6TC TGA6CASTAC TCGRGCTQC 1700 
C60GCGC6CC ACCA8ACATA ATA6CTGACA GACTAACAGA CTGITCCTTT 1750 
CCAT6GGICT TTTCOXSCAGT CACCGTCGGA CCATGI61X3A ACTTGATATT 1800 
TTACAXGATX CTCTTTAOCA ATTCT6CCCC GAATTACACT TAAAA06ACT 1850 
CMCK3CTTA ACGTTGGCTT 6CCACGCATT ACTTGACTGT AAAACTCTCA 1900 
CTCTTACCGA ACTTGeCOOT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950
AACATCAAAC GAATCGAOSS ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000
GCGACTCGCT GTATACCXSTT GGCATGCTAG CTTTATCTGT TCGGGAATAC 2050 
GATGCCCATT GTACTTGTTG ACTG6TCTGA TATTCGTGAG CAAAAACGAC 2100 
TTATG6TATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150 
TATGAGAAAG C6TTCCCGCT TTCA6AGCAA TGTrCAAAGA AA6CTCATQA 2200 
CCAATTTCTA GCCGACCTT6 CGA6CATTCT ACC6A6TAAC ACCACACCGC 2250 
TCATTGTCAG TGATGCXGQC TTTAAAGTGC CATG6TATAA ATCCGTTGAB 2300 
AAGCTGGGTT (36TACTGGTT AASTCGASTA AGAGGAAAAO TACAATATGC 2350 
AQACCTAGGA GCG6AAAACT GGAAACCIAT CAGCAACTTA CATGATATGT 2400 
CATCTA8TCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 2450 
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAGGCCGAAA 2500 
AAATCAGCGC TCGACACG6A CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550 
ACTCAGCGTC GGCAAiMSGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 2600 
GAAATTCGAA CACCCAAACA ACTTGrTAAT ATCTATTCGA AGC6AATGCA 2650 
GATTGAAGAA ACCTTCCQAG ACTTGAAAAG TCCTGCCTAC GGACTAG6CC 2700 
TACGCCAXAG CCGAACGAGC AGCTCA6A6C 6TTTTGATAT CATGCT6CTA 2750 
AT06CCCTGA TQCTTCAACT AACAT6TT66 CTTGCGGGCG TTCAT8CTC3V 2800
GAAACAAGGT TGGSMMGC ACTTCCAGGC TAACACAGTC AGAAATCGAA 2850
ACGTACTCTC AACAGTTCGC TXAGGCATGG AAGXTTTGCG GCATTCTG6C 2900
TACACAATAA CAAGGGAAGA CTTACTCGTG GCTGCAACCC TACTAGCTCA 2950 
AAATTTATTC ACACATGGTT ACGCTTTGGG GAAATTATGA TAATGATCCA 3000 
GATCACTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC TCTGTGTIGG 3050 
TTTTTTGTGG ATCTQCTGTG CCTTCTAGTT GCCAGCCATC TCTTGTTTGC 3100
CCCTCCCCCG TGCCTTCCTT GACCCTGGAA C3GIGCCACTC CC3VCTGTCCT 3150 
TTCCTAIITAA AATGAGGAAA TTGCATOGCA TTGTCTGAaT AGOTeTCATT 3200 
CTATTCT6GG G66TGGGGT6 GGGCAGCACA GCAAG6GGGA OQATTGGGAA 3250 
6ACAATA6CA GGCATGCTGG CX3ATGCGGT6 GGCTCTATGG GTACCTCTCT 3300 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGGTAC CTCTCTCTCT 3350 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCAGG TGCTGAAGAA 3400 
TTGACCCGGT OACCAAAGGT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450 
CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500 
CATCACAACA AAAACTGATT TAACAAATGG TTGC3TCTGCC TTAGAAAGTA 3550 
TATTTGAACA TTATCTTGAT TATATTATTG ATAATAATAA AAACCTTATC 3600 
CCTATCCAAQ AAGTQATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650 
TTAOCCTTGA ATACATTACT GGTAAQ6TAA AOSCCATrST CAGCAAATT6 3700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACGQAATBTT AATTCTCGTT 3750
GACCCTGAGC ACTGAT6AAT CCCCTAATQA TTTT6GTAAA AAXCATTAAG 3800 
TTAAG6TGQA TACACATCTT GTCATAXGAT CCCGGTAATO TOASTTAGCT 3850 
CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 3900 
TGTGTGGAAT TGTGAGOGGA TAACAATTTC ACACAGGflAA CAGCTATGAC 3950 
CATGATTACG CCAAGCGCGC AATTAACCCT CACXAAAGGG AACAAAAGCT 4000 
GGAGCTCCAC CGCGGTGGCG GCCGCTCTAG AACTAGTGGA TCCCCCGGGC 4050
TGCAGQAATT CGATATCAAG CTTATCGATA CCGCTQACCT CGAGGQQGGG 4100 
CCOGGTACCC AATTCGCCCT ATAGTGAGTC GTATTAOOCG CGCTCACTSO 4150 
CCCrrCGTTTT ACAACGTCGT GACTGGGAAA ACCCT6GCGT TACCCAACTT 4200 
AAT06CCTT0 CA6CACATCC CCCTTTCGCC AGCTGG06TA ATAGCGAA6A 4250 
Q6CCCX3CAOC 6AT0GCCCTT CCCAACAGTT GOSCAGCCTQ AAlGGCGAAT 4300
GGAAATT6TA A6CGTTAATA- TTTTGTTAAA ATTCGCGXIA AATTmGTT 4350 
AAATCAGCTC AXnrmAC CAATAGGCC6 AAATCGGCAA AATCCCTTAT 4400
AAATCAAAAG AATAGACCGA GATAGGGTTG AGTCTTGTTC CAGTTTGGAA 4450
CAAGAGTCCA CTATTAAAGA ACGTGGACTC CAACGTCAAA GG6CGAAAAA 4500
CCGTCTATCA GGGCGATGGC CCACTACTCC GGGATCATAT GACAAQATGT 4550
GTATCCACCT TAACTXAATG ATTTTTACCA AAATCATTAG GGOATTCATC 4600
AGTGCTCAGG GTCAACGAGA ATTAACATTC C6TCAGGAAA GCTTATGATG 4650
ATGATGTGCX TAAAAACTTA CTCAAtGGCT GGTTATGCAT ATCGCAATAC 4700 
ATGCGAAAAA CCTAAAAGA6 CTTGCCGATA AAAAAG6CCA ATTTATTGCT 4750 
ATTTAC0608 GCCXTTXATT GAQCTTGAAA 6ATAAATAAA ATAGATAGGT 4800 
T TTA T T T 6 AA GCTAAA.TCTT CTTTATGGTA AAAAATGCCC TCTTGGGTTA 4850 
TCAAGAiSGGT CATTATATTT 060G6AATAA CATCATTTSG TQAC6AAATA 4900 
ACTAAQCACT TGTCTCCTGT TTACTCCCCT GAGCTTGAGG G6TZAACAT6 4950 
AAGSrCATCG ATAGCAG6AT AATAATACAG TAAAACGCTA AAipCAATAAT 5000
CCAAATCCAS CCATCCCAAA TTGGTAGTGA ATGATTATAA ATAACAGCAA 5050 
ACAGTAATGG GCCAATAACA CCGGTTGCAT TGGTAAGGCT CACCAATAAT 5100 
CCCTGTAAAG CACCTTGCTG ATGACTCTTT GTTTGGATAG ACATCACTCC 5150 
CTGTAAT6CA GGTAAAGCGA TCCCACCACC AGCCAATAAA ATTAAAACAG 5200 
GQAAAACTAA CCAACCTTCA GATATAAACG CTAAAAAGGC AAATGCACTA 5250 
CTATCTSCAA TAAATCCGAG CAGTACTGCC GTTTTTTCGC CCATTTAGTG 5300 
GCTATTCXTC CIGCCACAAA GGCTTGGAAT ACTGAGT6TA AAAGACCAAG 5350 
ACCC6TAAT6 AAAAGCCAAC CATCATGCTA TTCATCATCA CGATXTCTGT 5400 
AATAGCACCA CACC6TGCT6 GATTGGCTAT CAATGCGCTG AAATAATAAT 5450 
CAACAAATGQ CATC6TTAAA TAAGTSATCT AIACOQATCA GCTTTTGTTC 5500 
CCTTTAQTGA QG6TTAATTG CGCGCTTGGC GTAATCATGG TCATAGCTGT 5550 
TTCCTGTGTG AAATTGTTAT CCGCTCACAA TTCCACACAA CATACGA6CC 5600 
GGAAGCATAA AGTGTAAAGC CTGGGGTGCC TAATGACTGA GCTAACTCAC 5650 
ATTAATT6CG TTGCGCTCAC TGCCCGCTTT CCAGTCGGGA AACCTGTCGT 5700 
GCCAGCT6CA TIAATGAATC GGCCAACGCG CGGGGAGAGG CGGTrTGCGT 5750 
ATTGGGCGCT CTTCOGCTTC CTCGCTCACT 6ACTCGCTGC GCTCGGTCGT 5800 
TCG6CTGCQG CQASOGSTAT CA6CTCACTC AAAGGCGGTA ATAC^TTAT 5850 
CCACA6AATC AG6G6ATAAC GCAGGAAAGA ACATGTGA6C AAAAGGCCAG 5900 
CAAAASGCCA GGAACCGTAA AAAGGCCGCG TTGCTGGCGT TTTTCCATAG 5950 
GCTCC6CCCC CCIGACGAGC ATCACAAAAA TCXSACGCTCA A6TCASAGGT 6000 
GGCGAAACCC GACAaOACTA TAAAGATACC AGGCGmCC CCCTGGAAGC 6050 
TCCCTCGTGC 6CTCTCCTGT TCCGACCCTG CC0CTTACC6 GATACCrGTC 6100 
CGCCTTTCTC CCTTCGGGAA GCOTGGCGCT TTCTCATAQC TCACGCTGTA 6150
G6TATCTCA0 TTCG6T6TA6 GTCGTTCGCT CCAAGCTGGG CTGT6TGCAC 6200 
GAACCCCCOQ TTCAGCCC6A CCGCTGOGCC TTATCCGGTA ACTATCGTCT 6250 
TQASICCAAC CC66TAAGAC ACGACTTATC GCCACTGGCA GCAGCCACTG 6300 
GTAACAGGAT TAGCAGAGC6 AGGTATGTAG GOGGTGCTAC AGAGTTCTTG 6350 
AAGTGGTGGC CTAACTACX3G CTACACTAGA AG6ACAGTAT TTGGTATCTG 6400 
CGCTCTGCTG AAGCCAGTTA CCTTCGGAAA AAGAGTTGGT AGCTCTTGAT 6450 
CCGGCAAACA AACCACCGCT GGTAGCGGTG GTTTTTTTGT TTGCAAGCAG 6500 
CAGATTACGC GCAGAAAAAA AGGATCTCAA GAAGATCCTT TQATCTTTTC 6550 
TACGGGGTCT GACGCTCAGT GGAACGAAAA CTCACGTTAA GGGATTTTGG 6600 
TCAT GAGAT T ATCAAAAAGG ATCTTCACCT AGATCCTTTT AAATTAAAAA 6650 
TBAAGrrrTA AAICAATCTA AA6TATATAT GABTAAACTT GGTCTGACAG 6700 
TTACCAATGC TTAATCACT6 AG6CACCTAT CTCA6CGATC TGTCZATTTC 6750 
GTTCATCCAT AGTXGCCTGA CTCCCCGTC6 TOTAGATAAC TACGATACGQ 6800 
QAGG6CTTAC CATCTG6CCC CAGTGCT6CA ATGATACC6C GAGACCCAOQ 6850 
CTCACCGGCT CCAGATTTAT CAGCAATAAA CCAGCCAGCC GGAAGGGCCG 6900 
AGCGCAGAAG TG6TCCT6CA ACTTTATCCG CCTCCATCCA GTCTATTAAT 6950 
TGTTGCCGG6 AAGCTAGAGT AAGTAGTTCG CCAGTTAATA GTTXGCGCAA 7000 
CGTTQTTGCC ATTGCTACAG GCATCGTGGT GTCACGCTCG TCGTTTGGTA 7050 
TGGCTTCATT CAGCTCCGGT TCCCAACGAT CAAGOCGAGT TACATGATCC 7100 
CCCATGTTGT GCSiAAAAAGC GGTTAGCTCC TTCGGTCCTC CGATC6TTGT 7150 
CAGAA6TAAG TTGGCCGCAS TGTTATCACT CAT66TTATC GCA6CACTGC 7200 
ATAATTCTCT TACT6TCATG CCATCC6TAA GATQCTTTTC TOTGACTGOT 7250 
QAGXACTCRA CCAAGTCArr CT6A6AATA6 T6TATG0GGC GACCGAGTTO 7300 
CTCTTGCCC6 606TCAATAC GGGAIAATAC CGCGCCACAT AGC3U3AACTT 7350
TAAAAGTGCT CATCATTGGA AAACGTTCTr CGGGGCGAAA ACTCTCAAGG 7400 
ATCTTAC08C TGTTeASATC CAGTT06AT6 TAACCCACTC CTGCACCCAA 7450 
CTGATCTTCA GCATCTTTTA CTTTCACCAG CGTTTCTGGG TGAGCAAAAA 7500 
CAGGAAGGCA AAATGCCGCA AAAAAGGGAA TAAGGGCGAC ACGGAAATGT 7550
TGAATACTCA TACTCTTCCT TTTTCAATAT TATT6AAGCA TTTATCAGG6 7600 
TTATTGrCTC ATGAGCG6AT ACATATTTGA ATGTATTTA6 AAAAATAAAC 7650 
AAATAGGGGT TCCGC6CACA TTTCCCOGAA AAGI6CCAC 7689 



SEQ ID NO:4 (a Kozak sequence)
ACCATGG 



SEQ ID NO: 5 (a Kozak sequence) 
ACCATGT 



SEQ ID NO:6 (a Kozak sequence)
AAGATGT 



SEQ ID NO:7 (a Kozak sequence)
ACGATGA 



SEQ ID NO:8 (a Kozak sequence)
AAGATGG



SEQ ID NO:9 (a Kozak sequence)
GACATGA 

SEQ ID NO:10 (a Kozak sequence) 
ACCATGA 
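
The Kozak variants listed above (SEQ ID NO:1 and SEQ ID NO:4 to NO:10) all place the ATG start codon at the same offset, preceded by three bases of context. A minimal Python sketch that makes this explicit is given below. The dictionary KOZAK_VARIANTS and the helper describe_kozak are hypothetical names introduced only for illustration, and the purine check at position -3 reflects the commonly cited Kozak consensus rather than a statement made in this specification.

```python
# Illustrative check of the Kozak variants listed in the sequence listing.
# Keys are SEQ ID NO values; sequences are copied from the listing.
KOZAK_VARIANTS = {
    1: "ACCATG",
    4: "ACCATGG",
    5: "ACCATGT",
    6: "AAGATGT",
    7: "ACGATGA",
    8: "AAGATGG",
    9: "GACATGA",
    10: "ACCATGA",
}


def describe_kozak(seq):
    """Return (offset of ATG, base at -3, base at +4 or None) for one variant."""
    seq = seq.upper()
    atg = seq.find("ATG")
    if atg < 3:
        raise ValueError("expected at least three bases of context before ATG")
    minus3 = seq[atg - 3]
    # +4 is the base immediately after the ATG, when the variant includes it.
    plus4 = seq[atg + 3] if len(seq) > atg + 3 else None
    return atg, minus3, plus4


if __name__ == "__main__":
    for seq_id, seq in KOZAK_VARIANTS.items():
        atg, minus3, plus4 = describe_kozak(seq)
        print(f"SEQ ID NO:{seq_id}: ATG at offset {atg}, -3 = {minus3}, +4 = {plus4}")
```

In this listing every variant carries A or G at position -3, and the variants differ mainly in positions -2, -1 and +4.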



SEQ ID NO:11 (conalbumin polyA)

tctgceattg ctgcttectc tgcccttcct cgtcactctg aatgtggctt ettogctact 
gccacagcaa gaaataaaat ctcaacatct aaacgggttt cctgaggttt ttcaagagtc 






gttaagcaca ttccttcccc agcacccctt 
ctactgctgc ccacgagaga aatccagttc 
tgccctagat cctgattaac aggcgtttgt 
atceeattgc ccccc 



gctgcaggcc agtgccaggc accaacttgg 
aatactcccc aaagcaaaat ggattacata 
attatctagt gctttcgctc cacccagatt 



SEQ ID NO:12 (synthetic polyA)

GGCGCCTCGATCCAGATCACTTCTGGCTAATAAAACIATCAGAGCTCTAGAGATCTC 
TGTGGATCTGCTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTCXrCCCTCCC^^ 
CT(«aUW«?KSCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGC^^ 
TGTCATTCTATTCrcGGGGCTGGGGTGC^CAGCACAGCAAGGOSGAGGATTaX.^^ 

TCTOGGTACCTCTCTC 



SEQ ID NO:13 (avian optimized polyA)

ggggatcgc tctagagcga 
teegggatet egsgaaaagc 
gttggtgaec aaaggtgcct 
tttatcatca ctttaaaaat 
aaaaaacaat tactcagcgc 
ctgttataag cagcaattaa 
ttatgattga tgcctacatc 
acaacaaaaa ctgatttaac 
aaatggttgg tctgccttag 
aaagtatatt tgaacattat 
cttgattata ttattgataa
taataaaaac cttatcccta 
tccaagaagt gatgcctatc 
attggttgga atgaactbga 
aaaaaattag ccttgaabac 
attactggta aggtaaacgc 
cattgtcagc aaaccgatcc 
aagagaacca a 

SEQ ID NO:14 (vitellogenin promoter)

TGAATCTGTT CTTGTGTTAT 
O^TATAAAT CACAGTTAGT 
GATGAAGTTG GCTGCAAGCC 
T6CATCAGTT CAGCTACTTG 
6CTGCATTTT GTATTTGGTT 
CTGTAG6AAA TGCAAAAGGT 
TCIAGQCTGA CCTGCACTTC 
TAVCCCTCTT 6CCTTACTGC 
TGAGAATCTC T6CAGGTTTT 
AATTGTTCAC ATTTTGCTCC 
CATTTACTTT GGAAGATARA 
ATATTTACAG AATGCTTATG 
AAACCTTTGT TCATTTAAAA 
ATATTCCTGG TCAGCGTGAC 
C6GAGCTGAA AGAACACATT 
GATCCCGTGA TTTCAATAAA 
XACATATOTT CCAIATATTG 
TTTCTCAGTA GCCTCTTAAA 
TCATGSIGCGT T66T6CACAT 
AT6AA7ACAT GAATAGCAAA 
G6TTTATCT0 GATTACGCTC 
TQGCCTSCA6 GAATGGCXAT 
AAACCAAAGC TGAG6GAAGA 



GGGAGAGTAT 
QATTAtACZG 
GGGTTATTAT 
ACAACTTGGQ 
GGTCAACATA 
AACCAGTCTC 
GGACCATGTA 

Gccxnmccc 

AGCAA6TAGC 
TAARTTTATT 
TAGTAQAAGT 
GATACATTQA 
CAATCAQAAA 
ATCAGAQKT6 
ATTTCSATTTT 
C6TGAAGACA 
GCAAAAAGAG 
ATAAACT6AT 
AGGAATTCAO 
CCACGT6TTC 
TTCCATAAAA 
GCCIGGCAGA 
CCTTCGCT 



AGTCAATGTA 
ATTGCTQATT 
CAQCTAGATA 
TCAiGGTGCCA 
ACCT6G6CAA 
ATCTGTGGCA 
CCAGCAGCCA 
AATCTAGGAA 
ACATCAATTT 
GTAAATGCCG 
GITITACTGT 
AACTTCTGGT 

AAGGTmrr 

CCAAGGTATT 
CTTTATTOGC 
ATTTATGATT 
GA(3TGTTTAC 
AAAAAACTTG 
CAGAAAACAG 
CTGAACATTC 
GTCTCACCAT 
6CCCTATTCA 

SEQ ID NO:15 (fragment of ovalbumin promoter - chicken)
GAGGTCAGAAT GGTTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACAG 
AACAATAGCT TCTATAACTG AAATATATTT GCTATTGTAT ATTATGATTQ 
TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCTGTC 
ATCTGCCAGG CCATTAAGTT ATTCATGGAA GATCTTTGAG GAACACTGCA 
AGTTCATATC ATAAACACAT TTGAAATTGA GTATTGTTTT QCATT6TATG 
GAGCTATOrr TTGCTGTATC CTCAGAAAAA AAGTTTGTTA TAAAGCATTC 
ACACCCATAA AAAGATAGAT TTAAATATTC CAGCTATAG6 AAA6AAAGT6 
CGTCTGCTCT TCACTCTAOT CTCAGTTGGC TCCTTCACAT GCATOCTTCT 
TXATTTCTCC TATTTTGTCA AGAAAATAAT AGGTCACGTC rTGSTTCTCAC 
TTATGTOCre CCTAGCATG6 CTCAGATGCA C6TTGTA6AT ACAAGAAGGA 
TCAAAT6AAA CAGACTTCTQ GTCTGTTACT ACAACCATAG TAATAAGCAC 
ACTAACTAAT AATTGCTAAT TATGTTTTCC ATCTCTAAGG TTCCCACATT 
TTTCTGTTTT CTTAAAGATC CCATTATCTG GTTGTAACTG AAGCTCAATG 
QAACATGAGC AATATTTCCC A6TCTTCTCT CCCATCCAAC AGTCCTGATG 
GATTAGCAGA ACAGGCAGAA AACACATTGT TACCCAGAAT TAAAAACTAA 
TATTTGCTCT CCATTCAATC CAAAATGGAC CTATTGAAAC TAAAATCTAA 
CCCAATCCCA TTAAATGATT TCTATGGCGT CAAAG6TCAA ACTTCTSAAG 
GQAACCTOIO GGIGG6TCAC AATTCAGGCT ATATATTCCC CAGGGCTCA6 



SEQ ID NO:16 (chicken ovalbumin enhancer)
ccgggctgca gaaaaatgcc
cttgacctga tacctgattt
cagagagaaa ccatcactga
attcatctgt gacctgagca
atgaaaaggc aatttccaca
tgctccttcc taatgCcaaa
gtaggteeea gegactggat
ttttggataa aaagtgcttt
Cggtttaggg aeagacccac
ctgacctttt cttgggacaa
ttgcacagct gtgctgggca
gcaagaagat tgttgcttac
a99tggacta tgaaetcaca 
tcttcaaact ggggaaacaa 
tggctacagc accaaggtat 
aaatgatcta cctctccatg 
ctcacaatat gcaacaaaga 
attgtagtgg caaagaggag 
aagaggcttt gacctgtgag 
tataactttc aggtctccga 
aatgaaatgc ctggeatagg 
gcattgtcaa acaatgtgtg 
gggcaatcca ctgecaccta 
tctccctaga 



tceaaaggag 

cacaateeca caaaacagct 
gcaatggcaa tccattcgac 
aatggttgcc tctttccctc 
caaacagaga acaattaatg 
aacaaaatct caagttctga 
Ctcacctgga cttcatatcc 
gtctttattc atgagaetgt 
aaagggcagc agagccttag 
affiAaactat ttgtactgcb 
tcccaggtaa ccttccaact 



SEQ ID NO:17 (5' untranslated region)
GTGGATCAACATACAGCTAGAAAGCa^ATTGCCTTOAGCACTCAAGCTC^^ 



SEQ ID NO:18 (putative cap site)

ACATACAGCTAG AAAGCTGTAT TGCCTTTAGC ACTCAAGCTC AAAAGACAAC TCAGAGTTCA

SEQ ID NO:19 (Chicken Ovalbumin Signal Sequence)

AT6 GGCTCCftTCG GCGCAGCAAG CATGGAATTT TGTTTIGATG lATTCAAGSA GCTCAAASTC 
CACCATGCCA ATSAGaACAT CTTCTACTGC CCCATTQCCA TCATGTCASC TCTAGCCAT8 
OTATACCTGC GXGCAAAAGA CAGCACCAGQ ACACAGATAA AOAAGGTTGT TC6CTTTGAT 
AARCTTCCAG QATTCGGAGA CAGTATTGAA GCTCAGTGTG GCACATCTGT AAACGTTCAC 
TCTTCACTTA GJVSACATCCT CAACCAAATC ACCAAACCAA ATQATGTTTA TTCGTTCAGC 
CTTGCCAGTA GACTTTATGC TGAAGAGA6A TACCCAATCC TQCCAGAATA CTTGCAGTGT 
6T6AA6QAAC TGTATAGAGG AGGCTTGGAA CCTATCAACT TTCAAACAGC TGCAGATCAA 
GCCAQAG2^ TCATCAATTC CTSG6TAGAA AGTCAGACAA ATGGAATTAT CAGAAATGTC 
CTTCAGCCAA GCTCCGTGGA TTCTCAAACT GCAATGGTTC TGGTTAATGC CATTGTCTTC 
AAAGGACTGT GG6AGAAAAC ATTTAAGGAT 6AAQACACAC AAQCAATGCC TTTCAGAGTQ 
ACT6AGCAAS AAAGCAAACG TGTGCAGATG ATGTACCAGA TrGGTTTATT TA6A6TGGCA 
TCAAT6GCTT CTSAQAAAAT GAAGATCCT8 GA6CTTCCAT TXGCCAGIGO GACAATQA6C 
ATGTTG6TSC TSTT6CCTGA TGAAGTCTCA 66CCTTGA6C AGCTIGAGAG TATAATCAAC . 
TTTGAAAAAC TGACTGAAT6 GACCAGTTCT AATGTTATGG AAGAGAGGAA GATCAAA6TG 
TACTTACCTC GCATGAAQAT GGAGGAAAAA TACAACCTCA CATCTGTCTT AATGGCTATG. 
GGCATTACTG ACGTGTTTAG CTCTTCAGCC AATCTGTCTG GCATCTCCTC AGCAGAGAGC 
CT6AAGATAT CTCAAGCTGT CCATGCAGCA CATGCAGAAA TCAAT6AAGC AGGCAGA6AG 
6TGGTAGGGT CAGCAGAGGC TGGAGTGGAT GCTGCAAGCG TCTCTGAAGA ATTTAGGGCT 
GACCATCCAT TCCTCTTCTG TATCAAGCAC ATC6CAACCA ACGCCGTTCT CTICTTTGaC 
AQATGIGTTT CCCCT 



SEQ ID NO:20 (Chicken Ovalbumin Signal Sequence - shortened 50bp) 

ATG GGCTCCATCG GCGCAGCAAG CATGGAATTT TGTTTTGATG TATTCAAGGA 
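
To illustrate what the shortened signal sequence of SEQ ID NO:20 encodes, the fragment can be translated with the standard genetic code. The Python sketch below is illustrative only: the helper name translate and the compact codon-table construction are assumptions of this example, not part of the specification, and any trailing bases that do not form a complete codon are simply ignored.

```python
# Illustrative translation of SEQ ID NO:20 with the standard genetic code.
from itertools import product

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# SEQ ID NO:20 as printed in the listing, with the grouping spaces removed.
SEQ_ID_20 = "ATGGGCTCCATCGGCGCAGCAAGCATGGAATTTTGTTTTGATGTATTCAAGGA"


def translate(dna):
    """Translate complete codons from position 1; incomplete trailing bases are dropped."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(len(SEQ_ID_20), "nt ->", translate(SEQ_ID_20))
```

The printed fragment is 53 nt long (the ATG plus 50 further bases), so the last two bases do not form a codon and are left untranslated by this sketch.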



SEQ ID NO:21 (Chicken Ovalbumin Signal Sequence - shortened 100bp)
ATG GGCTCCATCG GCGCAGCAAG CATGGAATTT TGTTTTGATG TATTCAAGGA GCTCAAAGTC
CACCATGCCA ATGAGAACAT CTTCTACTGC CCCATTGCCA



SEQ ID NO:22 (vitellogenin targeting sequence). 

ATGAGGGGGATCATACTGGCATTAGTGCTGACCCTTGTAGGCAGCCAGAAGTTTGACATTGGT



SEQ ID NO:23 (pro-insulin sequence)

TTTGTGAACCAACACCTGTGCGGCTCACACCTGGTGGAAGCTCTCTACCTAGTGTGCGGGGAACGAGGC 
TTCTTCTACACACCCAAGACCCGCCGGGAGGCAGAGGACCTGCAGGTGGGGCAGGTGGAGCTGGGCGGG
GGCCCTGGTGCAGGCAGCCTGCAGCCCTTGGCCCTGGAGGGGTCCCTGCAGAAGCGTGGCATTGTGGAA
CAATGCTGTACCAGCATCTGCTCCCTCTACCAGCTGGAGAACTACTGCAACTAG



SEQ ID NO:24 (p146 protein) 
KYKKALKKLAKLL 

SEQ ID NO:25 (p146 coding sequence) 
AAATACAAAAAAGCAGTGAAAAAACTGGCAAAAGTGCTG 

SEQ ID NO:26 (spacer) 

(GPGG)n

SEQ ID NO:27 (spacer) 
GPGGGPGGGPGG 

SEQ ID NO:28 (spacer)
GGGGSGGGGSGGGGS 

SEQ ID NO:29 (spacer) 
GGGGSGGGGSGGGGSGGGGS



SEQ ID NO:30 (repeat domain in TAG spacer sequence) 
Pro Ala Asp Asp Ala 



SEQ ID NO:31 (TAG spacer sequence)

Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp
Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp 

SEQ ID NO:32 (gp41 epitope) 

Ala Thr Thr Cys Ile Leu Lys Gly Ser Cys Gly Trp Ile Gly Leu Leu

SEQ ID NO:33 (polynucleotide sequence encoding gp41 epitope)

Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Thr Thr Cys Ile Leu Lys Gly
Ser Cys Gly Trp Ile Gly Leu Leu Asp Asp Asp Asp Lys



SEQ ID NO:34 (enterokinase cleavage site) 
DDDDK 



SEQ ID NO:35 (TAG sequence)

Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp
Ala Pro Ala Asp Asp Ala Pro Ala Asp Asp Ala Thr Thr Cys Ile Leu Lys Gly Ser Cys
Gly Trp Ile Gly Leu Leu Asp Asp Asp Asp Lys



SEQ ID NO:36 (altered transposase Hef forward primer)
ATCTCGAGACCATGTGTGAACTTGATATTTTACATGATTCTCTTTACC

SEQ ID NO:37 (altered transposase Her reverse primer)
GATTGATCATTATCATAATTTCCCCAAAGCGTAACC 

SEQ ID NO:38 (Xho I restriction site)
CTCGAG

SEQ ID NO:39 (Bcl I restriction site)
TGATCA 

SEQ ID NO:40 (CMVf-NgoM IV primer) 
TTGCCGGCATCAGATTGGCTAT 

SEQ ID NO:41 (Syn-polyAr-BstE II primer)
AGAGGTCACCGGGTCAATTCTTCAGCACCTGGTA 
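
The primer names above indicate which restriction site each primer is meant to carry. A minimal Python sketch that locates those sites in the primer sequences follows. The Xho I and Bcl I sites are those given as SEQ ID NO:38 and NO:39; the NgoM IV (GCCGGC) and BstE II (GGTNACC) recognition sequences are the standard ones for those enzymes and are not stated in this listing. The names PRIMERS, SITES and find_site are hypothetical, and two OCR-ambiguous characters in SEQ ID NO:36 are read here as G and T, which is an assumption that does not affect the site search.

```python
# Illustrative search for restriction sites in the listed primers.
import re

# Primer sequences keyed by SEQ ID NO (SEQ ID NO:36 uses an assumed reading
# of two OCR-damaged characters downstream of the Xho I site).
PRIMERS = {
    36: "ATCTCGAGACCATGTGTGAACTTGATATTTTACATGATTCTCTTTACC",
    37: "GATTGATCATTATCATAATTTCCCCAAAGCGTAACC",
    40: "TTGCCGGCATCAGATTGGCTAT",
    41: "AGAGGTCACCGGGTCAATTCTTCAGCACCTGGTA",
}

# Expected site per primer; N stands for any base.
SITES = {
    36: ("Xho I", "CTCGAG"),
    37: ("Bcl I", "TGATCA"),
    40: ("NgoM IV", "GCCGGC"),
    41: ("BstE II", "GGTNACC"),
}


def find_site(primer, site):
    """Return the 0-based offset of the first match of site in primer, or -1."""
    pattern = site.upper().replace("N", "[ACGT]")
    match = re.search(pattern, primer.upper())
    return match.start() if match else -1


if __name__ == "__main__":
    for seq_id, primer in PRIMERS.items():
        enzyme, site = SITES[seq_id]
        print(f"SEQ ID NO:{seq_id}: {enzyme} site at offset {find_site(primer, site)}")
```

Each site sits near the 5' end of its primer, preceded by two or three extra bases, which is a common primer layout for cloning by restriction digestion.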

SEQ ID NO:42 (pTnMod(Oval/ENT tag/Proins/PA) - Chicken)
CTGACGCGCC CTGTAGOGGC GCATTAA6CQ OGGCGGGTGT GGTGGTTAC6 50 
C6CAGCGTQA CCGCTACACT TGCCA6CGCC CTA6C6CCC6 CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTCa CCACGTTCGC CGGCATCAGA TrGGCTATTQ 150 
GCCATTGCAT ACGTTGTATC CATATCATAA TATGTACATT TATATTGGCT 200 
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250 
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 300 
CGTTACATAA CTTACGGTAA AT6GCCCGCC IGQCTGACC6 CCCAACGACC 350 
CCCGCCCATT GACGTCAATA AT6ACGTATG TTCCCATAGT AACGCCAATA 400 
GGGACTTTCC ATTGACGTCA ATGGGTG6AG TATTTACGGT AAACTGCCCA 450 
CTTG6CA03K CATCAAGTGT ATCATAT6CC AAGXACGCCC CCTATTGACG 500 
TCAATGACG6 TAAATQQCCC 6CCTG6CATT AT6CCCAGTA CATGACCTTA 550 
TOGQACTTTC CTACTTC6CA GTACATCIAC 6TATTAGTCA TCGCTATTAC 600 
CAT6STQAXG CGGTTTIGGC AGTACATCAA TGGGCGIGCA TAGOOGTTTC 650 
ACTCA086G6 ATTTCCAA6T CTCCACCCCA TTQACGTCAA TGGQASTTTG 700 
TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750
CCCATTQACG CAAATGGGCG GTAGGCGTGT ACQGTGGGAQ GTCTATATAA 800 
GCAGAGCTCQ TTTAGTGAAC CGTCAGAXCG CCTGGAGACX3 CCATCCACQC 850 
TGTTTT3ACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCCGCSGCCG 900 
GGAACGGTGC ATTQQAACGC GGATTCCCCG TGCCAAGAGT GACQTAAGTA 950 
CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000 
CTGTTTTTGQ CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTO 1050 
ATGGTATAGC TTAGCCTATA GGTOTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTQO TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTXX 1150 
GCCAC3MCTA TCTCTAnGG CTATATGCCA ATACTCTSTC CTTCAOAQAC 1200 
TGACaUSSGAC TCTGTATTTT TACAGGAT60 GGTCCCATTT ATTATTTACA 1250 
AATTCACATA TACAACAACO CCGTCCCCC6 TGCCCGCAGT TTTTATTAAA 1300 
CATAGCeTGG GATCTCCAC6 CGAATCTCGG GTACGT6TTC CGQACAT6GG 1350 
CTCTTCTCCQ GTAGCGGCG6 AGCTTCCACA TCOGAGCCCT G6TCCCATGC 1400 
CTCCAGCGGC TOVTGGTCGC TCGGCAGCTC CTTGCTCCiTA ACA6TGGAGG 1450 
CCAGACTXAQ GCaCAGCACA ATGCCCACCA CCACCAGTGT GCCGCACAAG 1500 
GCCGTGGCGO TAGGGTAOGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550 
CACG6CT6AC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCAG 1600 
GCAGCTGA6T TGTTGTATTC TGATAAGAGT CAGAGGTAAC TCCCGTTGCG 1650 
GTbCTGTXAA C66TGGAGGG CAGTGTAGTC TGAGCAGTAC TCGTIGCTGC 1700 
C6C6C60GCC AOCAOACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750 
CCATGOOFCT TTTCT6CA6T CACCGTCGQA CCATSTGIQA ACTTGATATT 1800 
TTACATSATT CTCITIACCA ATTCTGCCCC GAATTACACT TAAAAC8ACT 1850 
CAACAGXrrXA AC6TTCGCTT GCCAC6CATT ACTTQACrGrr AAAACXCTCA 1900 
CTCTTACC6A ACTTGGCCGT AACCTGCCAA CCAAAGCQAG AACAAAACAT 1950 
AACATCAAAC GAATCGACCG ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000 
GCGACTCGCT 6TATACCGTT GGCATGCTAG CTTTATCTGT TCGGGAATAC 2050 
GATGCCCATT GTACTTGTTG ACTGGTCTGA TATTCX3TGAG CAAAAACGAC 2100 
TTATGGTATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150 
TATGAGAAAG CGTTCCCGCT TTCA6AGCAA TGTTCAAAGA AAGCTCATGA 2200 
CCAATTTCXA 6C06ACC7TG CGAGCATTCT ACC6AGTAAC ACCACACCGC 2250 
TCATIGTCAB TGftTGCTGGC TTTAAAGTGC CATGGTAXAA ATCCGTTGAG 2300 
AAGCTSGSTT G6XACT6GTT AAGT06A6TA AGASGAAAAG TACAATATSC 2350 
AGACCTABGA GC6GAAAACT GGAAACCTAT CAGCAACTTA CATGATAT6T 2400 
CATCTA6TCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 2450 
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTOGCTCTA AA6GCC6AAA 2500 
AAATCAGCGC TCGACACGGA CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550
ACTCAGCGTC GGCAAAGGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 2600
GAAATTC6AA CACCCAAACA ACTTGTTAAT ATCTATTCGA AGCGAATGCA 2650 
GATTGAAGAA ACCTTCCGAG ACTTGAAAAG TCCTGCCTAC GGACTAGGCC 2700 
TACGCCATAG CCGAACGAGC AGCTCAGAGC GmTGATAT OITGCTGCTA 2750 
ATCGCCCXCa TGCTTCaACT AACATGTTGG CTT6CGG6CG TTCavrGCTCA 2800 
GAAACAAGGT T6GGACAAGC ACTTCCAG6C TAACACAOTC AGAAATCGAA 2850 
AC6TACTCTC AACAOTTCGC TTAGGCATGG AAOTmGCO GCATTCTQ6C 2900 
TACACAATAA CAAGGGAAGJl CTTACTOSTG GCTGCAACCC TACTAQCTCA 2950 
AAATTTATTC ACACAT6GTT AOGCTTTGGG GAAATTATGA ZAATQATCCA 3000 
GATCACTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC T6TQTGTTGG 3050 
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 3100 
CCCTCCCCCG TGCCrrCCTT GACCCTGGAA GGOXaCCACTC CCACTSTCCT 3150 
TTCCTAATAA AATGAGGAAA TTGCATCGCa TTGTCTGAGT AGGTGTC3VTT 3200 
CTATTCTGG6 GGGTGGGGTQ GGGCAGCACA GCAAGGGGGA GGATT6GGAA 3250 
6ACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGQ GTACCTCTCT 3300 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGCTAC CTCTCTCTCT 3350 
CTCrCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCAGQ T6CT6AA6AA 3400 
TTOACCCGCT GUiCCAAAGGT GCCTTTXATC ATCACTTTAA AAATAAAAAA 3450 
CAATTACTCA 6TGCCTGTTA TMGCAGCAA TTAATTAIGA TTCSATGCCTA 3500 
CATCACAACA AAAACTGATT TAACAAATG6 TTGGTCT6CC TTAGAAAGTA 3550 
TATTTGAACA TTATCTTGAT TAXATTATTO ATAATAATAA AAACCTXATC 3600 
CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAAT6AAC TTGAAAAAAA 3650 
TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CAGCAAATTC 3700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCT6 ACGGAATGTT AATTCTCGTT 3750 
GACCCTGAGC ACT6ATGAAT CCCCTAATGA TrTTGGTAAA AATCATTAAG 3800 
TTAAGGTGGA TACACATCTT GTCATATGAT CCCGGTAAT6 TGAGTTAGCT 3850 
CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCG GCTCGTATGT 3900 
IGTGT6QAAT TGTGA6CGGA TAACAATTTC ACACAGGAAA CAGCTATQAC 3950 
CATGATTAC6 CCAAGCGCGC AATTAACCCT CACTAAAGGS AACAAAAGCT 4000 
G6A6CTCCAC CGCGGTGGOG GCCGCTCTAG AACIAGT6GA TCCCCCGGGO 4050 
AGOTCAGMT G6TTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACAG 4100 
AACAATASCT TCTATAACTG AAATATATTT GCTATTQTAT AXTATSATT6 4150 
TCCCTC6AAC CAT6AACACT CCTCCAGCIO AATTTCACAA TTCCTCTGTC 4200 
ATCTQCCA6Q CCATTAAQIT ATICATGGAA GATCTT TGAfl QRACACTGCA 4250 
AGTTCATATC ATAAACACAT TTGAAATT6A GTATTGTTTT GCATTGTAXO 4300 
GAGCTATGTT TTGCTGTATC CTCAGAAAAA AAGTTTGTTA TAAAGCATTC 4350 
ACACCCATAA AAAGATAGAT TTAAATATTC CAGCTATAGG AAAGAAAGTG 4400 
C6TCTGCTCT TCACTCTAGT CTCAGTTGGC TCCTTCACAT GCATGCTTCT 4450 
TTATTTCTCC TATTTTGTCA AGAAAATAAT AGGTCACGTC TTGTTCTCAC 4500 
TTATGTCCTG CCTAGCATGG CTCAGATGCA CGTTGTAGAT ACAAGAAGGA 4550 
TCAAATGAAA CAGACTTCTG GTCTGTTACT ACAACCATAG TAATAAGCAC 4600 
ACTAACTAAT AATTGCTAAT TATGTTTTCC ATCTCEAAGG TTCCCACATT 4650 

vrscsesm cttaaagatc ccattatctq gttgzaactg aagctcaats 4700 

SAACATGAOC AATATTTCCC AQTCTTCTCT CCCATCCAAC A8TCCTGAT0 4750 
OATTAGCAGA ACAGGCAGAA AACACATTCT TACCCAGAAT TAAAAACZAA 4800 
TATTT6CTCT CCATTCAATC CAAAATG6AC CXATTGAAAC TAAAATCTAA 4850 
CCCAATCCCA TTAAATGATT TCTATGGOQT CAAA6GTCAA ACTTCTGAA6 4900 
GGAACCTGTG GGTGGGTCAC AATTCAGGCT ATATATTCCC CAOOGCTCAS 4950 
CGGATCCATG G6CTCCAXCG GC6CAGCAAG CATGGAATTT 7GTTTTGATG 5000 
TATTCAAG6A 6CTCAAA6TC CACCATGCCA ATGAGAACAT CTTCTACTGC 5050 
CCCATTGCCA TCATGTCAGC TCTAGCCATG 6TATACCTGG 6TGCAAAAGA 5100 
CAGCACCAGQ ACACAGATAA ATAAGGTTGT TCGCTTTGAT AAACTTCCAG 5150 
GATTGGOAGA CA6TATTGAA GCTCAGTGTG GCACATCTGT AAAOG TTCAC 5200 
TCTTCACTTA GAGACA.TCCT CAACCAAATC ACCAAACCAA ATGAI6TTTA 5250 
TTCQTTCAGC CTTGCCAGPrA GACTTTATGC TGaUVGAGAGA TACCCAATCC 5300 
TGCCAOAATA CTT6CA6I6T GT6AAGGAAC T6IATA6A66 AG6CITGGAA 5350 
CCTATCAACT TTCAAACAGC TGCAGATCAA GCCA6AGAGC TCATCAATTC 5400 
CTGG6TAGAA AGTCAGACAA ATGGAATTAT CAGAAATGTC CTTCA6CCAA 5450 
GCTCCGTGGA TTCTCAAACT GCAATGOTTC TGGTTAATGC CATTGTCTTC 5500 
AAAGGACTGT GGQAGAAAAC ATTTAAG6AT GAAGACACAC AAGCAATGCC 5550 
TTTCA6AGT6 ACTGAGCAAG AAAGCAAACC TGTGCAGATG ATGXACCAGA 5600 
TTGGrCTATr TUGMSTGQO^ TCA^TGGCTT CT6AGAAAAT GAAGATCCT6 5650
6A6CTTCCAT TTGCCMmxS GACAATGAGC ATGTTGGTOC TGITGCCTGA 5700
TQAAGPrCTOl GGCCnOAGC AGCTTGAOAG TATAATCAAC TTRSAAAAAC 5750
TSACrSAATS 6ACCA6TTCT AAT6TTAT66 AAGAGAGGAA 6ATCAAAOTG 5800
TACTTACCTC GCATGAAGAT GGAGGAAAAA TACAACCTCA CATCTGTCTT 5850
AATGGCTATQ GGCATTACTG ACGTGTTTAG CTCTTCASCC AATCT6TCTQ 5900
GCATCTCCTC AGCAGAGAGC CTGAAGATAT CTCAAGCTGT CCATGCAGCA 5950
CATGCAGAAA TCAATGAAGC AGGCAGA6AG GTGGTAGGGT CAGCAGAGGC 6000
TGGAGTGGAT GCTQCAAGCQ TCTCTGAAGA ATTTAGGGCT GACCATCCAT 6050
TCCTCTTCTG TATCAAGCAC ATCGCAACCA ACOCCGTTCT CTTCTTTGGC 6100
AGATGTGTTT CCCCTCCGCG GCCAGCAGAT GACGCACCAG CAGATGACGC 6150
ACCAOCAGAT QAC6CACCA6 CAGATGACGC ACCAOCAGAT GACGCACCAG 6200
CAOATGAOGC AACAACATGT ATCCT6AAAG GCTCTTSTGO CTG6ATCG6C 6250
CIGCT6QA3Q ACGATGACAA ATTTGTQAAC CAACACCIGT Q08GCTCACA 6300
CCT66TGQAA OCTCTCTACC TAGT6TGCGG GGAACGAGGC TTCTTCTACA 6350
CACCCAAGAC CCGCCGGGAS GCAGAGGACC TGCAGGTGGO GCAGGTGGA6 6400
CTGGGCGGGG GCCCTGGTGC AGGCAGCCTG CAGCCCTTGG CCCTGGAGG6 6450
GTCCCTGCAB AAQCGTGGCA TTGTGGAACA ATGCTGXACC AGCATCTGCT 6500
CCCTCTACCA GCTGGAGAAC TACTGCAACT AGGGCGCCra 6ATCCAGATC 6550
ACTTCTGGCT AATAAAAGAT CAGAGCTCTA GAGATCTGTS TGTTG6TTTT 6600
TTGTGGATCT GCTGTGCCTT CTAGTTGCCA GCCATCTGTT GTTTGCCCCT 6650
CCCCCGTGCC TTCCrrGACC CTGGAAGGTG CCACTCCCAC TGTCCTTTCC 6700
TAATAAAAIQ AQ6AAATTGC ATCGCATTGT CTGAGTAGGT GTCATTCTAT 6750
TCTGSGGGQF 6G0GTGGG6C AGCACAGCAA GGGGGAG6AT TG6GAAGACA 6800
ATACCAQGCA TQCtaaGGAT acOGTOQGCT CTATOaSXAC CTCICTCTCT 6850
CTCTCTCTCI CTCTCTCTCT CTCTCTCTCT CGGTACCICT CTCQAQGQGO 6900
G6CCC6GTAC CCAATXCGCC CTATAGIGAO TCGTAT1A06 CGOGCTCACT 6950
GGCcsToen mcAACorc gtgactggga AAACCCTQ6C GTTACCCAAC 7000
TTAAT060CT TGCAGCACAT CCCCCTTTCO CCAGCTQQOS TAATAG06AA 7050
6AGGCCC6CA CCGATCGCCC TTCCCAACAG TIGCGCAGCC TGAATGGCGA 7100
ATGGAAATTG TAAGCXxTTAA TATTTTGTTA AAATTC6C6T TAAATTTTTG 7150
TTAAATCAGC TCATTTTTTA ACCAATAGGC CGAAATCGGC AAAATCCCTT 7200
ATAAATCAAA AGAATAGACC GAGATAGGGT TGA6TGTTGT TCCAGTTTGG 7250
AACAA6ASIC CACTATTAAA GAACGTGGAC TCCAACGXCA AAGGGCGAAA 7300
AACCOTCTAT CAGGGCGATG GCCCACTACT CCGGGATCAT ATGACAA6AT 7350
GTGTAtCCAC CTTAACTTAA TGATTTTTAC CAAAATCATT AGGGGATTCA 7400
TCAGTGCTCA GGGTCAACGA GAATTAACAT TCC6TCAGGA AAGCTTATQA 7450
TBKSBKSSBS CITAAAAACT TACTCAATGG dGGTTATGC ATATCGCAAT 7500
ACAT80QAAA AACCXAAAAQ AGCTTQCCQA TAAAAAAG6C CAATTTATTG 7550
CXATTXACCG OGGCTTTTTA TTGAGCTTGA AAGAZAAATA AAATAGATAO 7600
GTTTTATTTO AAGCTAAATC TTCnTATCG TAAAAAAT6C CCTCTT6GGT 7650
TATCAAOAGQ QXCATTATAT TTGGCGQAAT AACATCATTT GGIGACQAAA 7700
TAACXAAQCA CTTGTCTCCT GTTTACTCCC CTGAGCTT6A GGGGTTAACA 7750
TGAAGOTCAT OSATAGCAGG ATAATAATAC AGTAAAACGC TAAACCAATA 7800
ATCCAAATCC AGCCATCCCA AATTGGTAGT 6AATQATTAT AAATAACA6C 7850
AAACAGTAAT GGGCCAATAA CACCGGITGC ATTGGTAAGG CTCACCAATA 7900
ATCCCTGTAA AGCACCTTGC TGATGACTCT nGTTXGGAT AGACATCACT 7950
CCCTGTAATG CAGGTAAAGC GATCCCACCA CCAGCCAATA AAATTAAAAC 8000
AGGGAAAACT AACCAACCTT CAGATATAAA CGCTAAAAAG GCAAATGCAC 8050
TACTATCTGC AATAAATCCG AGCAG3ACTG CCGTTTTTTC GCCCCATTTA 8100
QTGGCTAXTC TTCCTGCCAC AAAGGCTTGG AATACTGAGT OXAAAAGACC 8150
AAGACCC6CT AAIGAAAAGC CAACCATCAT GCTATTOCAT CCAAAACQAT 8200
TTTCGSZAAA ZAGCAaXSVC ACCGTIGCGG aAATTTGGCC TATCAATTGC 8250
6CT6AAAAAT AAATAATCAA CAAAA1GGCA TCGTTrCAAA tAAAGTQATG 8300
TATACCQAAT TCAGCTTTTG TTCCCTTTAG TGAGGGTTAA TIGCGCGCTT 8350
GGC6TAATCA TGGTCATAGC TGrrTCCTGT GS^rAATTGT TATCCGCTCA 8400
CAATTCCACA CAACATACGA GCCGGAAGCA TAAAGTGTAA AGCCTGGG6T 8450
GCCTAATGAG TGAGCTAACT CACATTAATT GCGTTGCGCt CACTGCCCGC 8500
TTTCCAGTOG GGAAACCTGT CGTGCCAGCT GCATTAATGA ATCGGCCAAC 8550
GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTCTTCOGC TXCCTCGCTC 8600
ACTGACTOGC ZGC6CTC66T CGTTOGGCTG 066CGAGCGG TATCA6CTCA 8650
CTCAAIVGGC6 GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGQAA 8700
AGAACAT6TG AGCAAAAGGC CAGCAAAAG6 CCAGQAACCG TAAAAAGGCC 8750
GC6TTGCTS0 CGTTTTTCCR TAG6CTCCGC CCCCCTGACG AGCATCACAA 8800
AAATC6AC6C TCAAGTCA6A GGTQGCGAAA CCCGACAG6A CTATAAAGAT 8850
ACCAiSGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC 8900
CTGCCGCTTA CCGOATACCT GTCCGCCTTT CTCCCTTCGG 6AAGCGTGGC 8950
GCTTTCTCAT AGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC 9000
GCTCCAAGCT GGGCTOTGTQ CACGAACCCC CCGTTCAGCC CGACCGCTGC 9050
GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT 9100
ATCGCCACTO GCAGCAGCCA CTGGTAACAG GATTAGCA6A GC6AGGTAT6 9150
TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCIACACT 9200
AGAAGGACAG TATTTGGIAT CT6CGCTCTQ CTGAAGCCAG TTACCrTCQG 9250
AAAAAGAGTT GGTASCTCTT GATCCGQCAA ACAAACCACC GCTGGTAGCG 9300
GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAG6ATCT 9350
CAAGAAGATC CTTIGATCTT TTCTACGGG6 TCTGAC6CTC AGTGGAACGA 9400
AAACTCAC6T TAAOGSATTT TSGTCATGAG ATTATCAAAA AGGATCTTCA 9450
CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA 9500
TAT6AGTAAA CTTG6TCTQA CAGTTACCAA TGCTTAATCA GTGAGGCACC 9550
TATCTCAGCXS ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCCCCQ 9600
TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG CCCCAGTGCT 9650
GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT TATCAGCAAT 9700
AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT GCAACTTTAT 9750
CCGCCTCCAT CCAGICTATr AATTGTTGCC GGGAAGCTAG AOTAAGTA6T 9800
TGGCCASTTA ATAGTTI606 CAACGTTOIT GCCATTGCTA CAGGCADGGT 9850
G6TGXCA06C TOGTOSmG GTATG6CTTC ATTCAGCTCC GGTTCCCAAC 9900
GATCAAOGOG AGmCKTOA TCCCCCATGT TGT6CAAAAA AGOGGTrAGC 9950
TCCTTOSeTC CTCCGATCGT TGTCAGAAGT AAGTTGGCCG CAGT6TTATC 10000
ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC ATGCCSVTCCG 10050
TAAGATCCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA 10100
TAGTGTATGC GGCGkCCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA 10150
TACCGCGCCA CATAGCKQAA CTTTAAAAGT GCTCATCATT GQAAAACGTT 10200
CTTCGGGGCQ AAAACTCTCA AGGATCTTAC CGCTGTTGAG ATCCA6TTCG 10250
AT6TAACCCA CTCGTGCACC CARCTGATCT TCAGCATCTT TTACTTTCAC 10300
CA6CGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC GCAAAAAAGQ 10350
GAATAA6GGC GACACGGAAA TQTTGAATAC TCATACTCTT CCTTTTTCAA 10400
TATTATTOAA GCATTTATCA GGGTTATTGT CTCATQAGCG GATACATATT 10450
TSAATGXATT TA6AAAAAXA AACAAATAGQ GGTTCCGCSC ACATTTCCCC 10500
6AAAAGXGGC AC 10512



SEQ ID NO:43 (pTnMOD (CMV-CHOVg-ent-ProInsulin-synPA))

1 ctgaogcgec etgbagcsge geattaagcg cggcsggtgt ggcggttacg cseagogtga 
61 ccgctaeaet tgccagcgcc ctagegccog ctcctttcgc tttcttccct tcctttctog 
121 ccacgttege cggcatcaga ttggetattg gecattgcat acgttgtatc catatcat;aa 
181 tatgtacatt Catattgget catgteeaae abtacegcca tgttgacatt gattattgac 
241 tagttattaa Cagtaatcaa ttacggggte attagttcat agcecatata tggagttceg 
301 cgttacataa cttacggtaa atggcccgec tggctgaccg cccaacgaec cccgcceatt 
361 gacgtcaata atgacgtatg ttcccatagt aacgecaata gggactttce attgaogcca 
421 atgggtggag tatttacggt aaactgccca cttggcagta catcaagtge ateatatgce 
481 aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt acgceeagta 
541 catgacctta tgggaettte ctactfcggca gtacatctae gtattagtea togetaCtac 
601 catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg acteacgggg 
661 attfcccaagt ctccacecca ttgacgtcaa tgggagtttg ttttggeaee- aaaateaacg 
721 ggactttcca aaacgtcgca acaactccgc cccactgacg caaatgggeg gtaggcgtgt 
781 acggtgggag gtccatataa gcagagctcg tttagtgaac cgtcagatcg cctggagaog 
841 ccatccacgc tgccttgaee tccacagaag aeaecgggae cgacccageo tcegcggccg 
901 ggaacggtgc atcggaacgc ggattecccg Cgccaagagc gacgcaagca cegcetatag 
961 actctatagg cacacccctt tggctcttat gcacgctata ctgtttttgg cttggggcct 
1021 atacaccccc gcttccttat gctacaggtg atggtatagc ctagcctata ggtgtgggtt 
1081 attgaccatt attgaccact cccctactgg tgacgatact ttccattact aatceataac 
1141 atggctcttt gccacaacca tctctattgg ctatatgcca atactctgtc cttcagagac 
1201 tgaeacggae tetgtatttt tacaggatgg ggteccatet atCaettaea aatccacata 
1261 tacaaeaacg ccgtcccccg tgcccgcagt ttttattaaa catagcgtgg gatctccacg
1321 cgaacctcgg gtacgtgctc cggacatggg ctctcctccg gtagcggcgg agcttccaca
1381 tecgagccct ggtcccatgc ctccagoggc tcatggtcgc tcggcagctc cttgctccta
1441 acagtgsagg ccagacttag gcacagcaea atgcccacca ccaccagtgt gccgeacaag
1501 gccgtggcgg tagggtatgt gtctgaaaat gagcgtggag attgggctcg cacggctgae
1561 geagatggaa gacttaagge agcggcagaa gaagatgcag gcagcegagt tgttgtatte
1621 tgataagagt cagaggtaae teecgetgeg gtgcegttaa cggtggaggg cagtgtagtc
1681 tgagcagtac tcgttgccgc egegegcgce aecagacata atagctgaea gaccaaeaga
1741 etgttccttt ccatgggtet eccctgcagt caecgtcgga ccatgtgcga acttgatatt
1801 ttacatgatc ctctteacca actetgeecc gaattacaet taaaaegaet caacagctta
1861 acgttggett geeaegeatt aettgaetgt aaaactetca etcteaeoga actcggecgt
1921 aacccgccaa ccaaagcgag aacaaaacat aacaccMac gaatcgaccg attgttaggt
1981 aatcgtCBCc eccacaaaga gcgacccgct gtataccgtt ggcatgctag ctttatctgt
2041 tcgggcaaea egatgcccat cgtacctgtc gactggtctg atattcgtga gcaaaaacga
2101 cttatggtat Cgcgagettc agtcgcacta cacggtcgtt ctgttactct ttatgagaaa
2161 gogttcccgc ttccagagca atgtccaaag aaagctcatg accaatttct agiecgacctt
2221 gcgagcattc taecgagtaa caccacaceg ctcattgtca gtgatgctgg ccttaaagtg
2281 ccatggtata aacccgttga gaagctgggt tggtactggt taagtcgagt aagaggaaaa
2341 gtacaatatg cagacctagg agcggaaaac tggaaaccta tcagcaactt acatgatatg
2401 tcatctagte actcaaagac tttaggctat aagaggctga ctaaaagcaa tccaatctca
2461 tgecaaattc tattgtataa atctcgctct aaaggcegaa aaaatcagcg ctcgacacgg
2521 actcattgtc accacccgtc acctaaaatc tactcagcgt cggcaaagga gccatgggtt
2581 ctagcaacta acttacctgt tgaaattcga acacccaaac aacbtgttaa tatctattcg
2641 aagcgaatgc agattgaaga aaccttccga gacttgaaaa gtcctgccta cggactaggc
2701 ctacgccata gccgaacgag cagcccagag cgttttgata tcatgctgct aatcgccctg
2761 atgcttcaac taacacgttg gcttgcgggc gttcatgctc agaaaeaagg ttgggacaag
2821 cacttccagg ctaacacagt cagaaatcga aacgtactct caacagttcg cttaggcatg
2881 gaagttttgc ggcattctgg ctacacaata acaagggaag acttactcgt ggctgcaacc
2941 ctaetagctc aaaatctatt caeaeatggt tacgctttgg ggaaattatg ataatgatcc
3001 agatcacttc tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg
3061 gatctgctgt gccttctagt tgccagccat ctgttgtttg cccctccccc gtgccttcct
3121 tgaecctgga aggtgccact cceactgtcc tttcctaata aaatgaggaa attgcatcgc
3181 attgtetgag taggtgtcat tctattctgg ggggcggggt ggggcagcac agcaaggggg
3241 aggattggga agacaatagc aggcatgctg gggatgcggt gggctctatg ggtacctctc
3301 tctctctete tctetctctc tctctctcte tctctcggta ectctctctc tctctctcte
3361 tctctctete tctetctctc tcggtaccag gtgctgaaga attgaeccgg tgaccaaagg
3421 tgecttttat catcacttta aaaataaaaa aeaattaete agtgeetgtt acaagcagea
3481 attaattatg attgatgect aeatcacaae aaaaaetgat ttaaeaaatg gttggtetgc
3541 ettagaaagt atatttgaac attatettga ttatattaee gataataaCa aaaaecttat
3601 eeetatecaa gaagtgatgc etateattgg ttggaatgaa cttgaaaaaa attagccttg
3661 aataeattac tggtaaggta aacgceactg tcagcaaatt gatceaagag aaccaactta
3721 aagctttcct gacggaatgt taattctegt tgaceetgag caetgatgaa teecctaatg
3781 Bttttggtaa aaatcattaa gttaaggtgg ataeacatct tgtcatatga tcccggtaat
3841 gtgagttage tcacteatta ggcaccceag getttaeact ttatgettee ggctogtatg
3901 ttgtgtggaa ttgcgagcgg ataacaattt caeacaggaa acagotatga ecatgabtae
3961 siccaagogcg caattaacee tcactaaagg gaacaaaagc tggagcteoa eegcggtgge
4021 ggcogctcta gaactagtgg accccccggg catcagattg getattggcc attgeatacg
4081 ttgtatecat atcataatat gtacattcat attggetcat gtccaacate accgceatgt
4141 tgaeattgat tattgactag ttatcaatag taaCcaatta eggggtcatt agttcatagc
4201 ccatatatgg agttccgcgt taeataactt acggtaaatg gcccgcctgg etgacegecc
4261 aacgaccccc gcccattgac gtcaataatg acgtatgtte ccatagtaae gccaataggg
4321 aetttccatb gacgtcaatg ggtggagtat ttacggtaaa etgcecactt ggcagtaeae
4381 caagtgtatc atatgccaag tacgeeccct attgaegtea atgacggtaa atggeccgeo
4441 tggeattatg cccagtacat gaccttatgg gactttccta ctt^cagta catctacgta
4501 ttagteaceg ctattaccat ggtgatgcgg ttttggcagt acatcaaCgg gegtggatag
4561 cggtttgact caeggggatt tccaagtctc caccccattg acgtcaatgg gagtttgttt
4621 tggcaccaaa atceuicggga ctttccaaaa tgtcgtaaca actecgccec abtgacgcaa
4681 atgggcggta ggcgtgtacg gtgggaggtc tatataagca gagctcgttt agtgaaeegt
4741 cagatcgcct ggagacgcca tccacgctgt tttgaTCtcc atagaagaca ccgggaccga
4801 tccagcctcc gcggccggga acggtgcatt ggaacgcgga ttccccgtgc caagagtgac
4861 gtaagtaccg cctatagact ctataggcac acccctttgg ctcttatgca tgctatactg
4921 tttttggctt ggggcctata caccccpgct tccttatgcc ataggtgatg gtatagctta
4981 gcctataggt gtgggttatt gaccattatt gaccactccc ctattggtga cgatactttc
5041 cattactaat ccataacatg gctctctgcc acaactatct ctattggcta Catgccaata
5101 ctctgtcctt cagagaccga cacggactct gtatttttac aggatggggt cccatttatt
5161 atttacaaat tcacatacac aacaacgcc^ tcccccgtgc ccgcagtttt tattaaacat
5221 agcgtgggat ctccacgcga atctcgggta cgtgttccgg acatgggctc ttctccggta
5281 gcggeggagc ttccaeatee gagccctggt cecacgcctc cageggetca Cggtegetcg
5341 gcagctcctt gctcctaaca gtggaggcca gacccaggca cagcacaatg cccaccacca
5401 ccagtgtgcc gcacaaggcc gtggcggtag ggtaCgtgtc tgaaaatgag cgtggagatt
5461 gggctcgcac ggctgacgca gatggaagac ttaaggcagc ggcagaagaa gatgcaggca
5521 gctgagttgC tgtattctga taagagtcag aggtaactee egttgcggtg ctgctaacgg
5581 tggagggeag tgtagtctga gcagtactcg ttgctgccge gcgegecacc agacataaea
5641 gctgacagac taacagactg ttcctttcca tgggtctttt ctgcagtcae cgccggatca
5701 atgggctcca tcggtgcagc aagcatggaa tbttgttttg atgtattcaa ggagctcaaa
5761 gtccaccatg ccaatgagaa catcttccae tgccecattg ccatcatgtc agctctagcc
5821 atggtacacc tgggtgcaaa agacagcace aggaeaeaaa taaataaggt tgttcgcttt
5881 gataaacttc caggattcgg agaeagtatt gaagcteagt gtggcaeatc tgtaaacgtt
5941 cactcctcac ttagagacat cctcaaccaa atcaceaaac caaatgatgt ttattogttc
6001 agccttgcca gtagacttta tgctgaagag agataceeaa tcctgccsiga atacttgcag
6061 tgcgtgaagg aactgtatag aggaggcctg gaacctatca actttcaaac agctgcagat
6121 caagccagag agctcatcaa ttcctg^ca gaaagtcaga caaatggaat tatcagaaat
6181 gtecttcagc caagctccgt ggattctcaa actgcaatgg ttetggttaa tgccattgtc
6241 ttcaaaggae tgtgggagaa agcatttaag gatgaagaca cacaagcaat gcctttcaga
6301 gtgaetgagc aagaaagcaa acctgtgcag atgatgtacc agattggttt acttagagtg
6361 gcatcaatgg cttctgagaa aatgaagatc ctggagctcc catttgccag tgggacaatg
6421 agcatgttgg tgctgttgcc tgatgaagtc tcaggccttg agcagcttga gagtataate
6481 aactttgaaa aactgactga atggaccagt tctaatgtta tggaagagag aagatcaaag
6541 tgtacttacc tcgcatgaag atggaggaaa aataeaacct cacatctgtc ttaatggcta
6601 tgggcatcac tgacgtgtcc agctcttcag ccaatctgtc tggcatcCcc tcagcagaga
6661 gcctgaagat atctcaagce gtccatgcag cacatgcaga aatcaatgaa gcaggcagag
6721 aggtggtagg gtcagcagag gcCggagCgg atgctgcaag cgtctctgaa gaatttaggg
6781 ctgaccatcc attcctcttc tgcatcaagc acatcgcaac caacgccgtt ctcttctttt
6841 ggcagatgtg tttcccgcgg ccagcagatg acgcaccagc agatgacgca ccagcagatg
6901 acgcaccagc agatgacgca ccagcagatg acgcaacaac atgtatcctg aaaggctcCt
6961 gtggctggat cggcctgctg gatgacgatg acaaatttgt gaaccaacac ctgtgcggct
7021 cacacctggt ggaagctctc tacctagtgt gcggggaacg aggcttcttc tacacaccca
7081 agacccgccg ggaggcagag gacctgcagg tggggcaggt ggagctgggc gggggccctg
7141 gtgcaggcag cctgcagccc ttggccctgg aggggtccct gcagaagcgt ggcattgtgg
7201 aacaacgctg taccagcatc tgctccccct accagctgga gaactactgc aactagggcg
7261 cctaaagggc gaattatcgc ggccgctcta gaccaggcgc ctggatccag atcacttctg
7321 gctaataaaa gatcagagct ctagagatct gtgtgttggt tttttgtgga tctgctgtgc
7381 cttctagctg ccagccaCct gtCgtttgcc cctcccccgt gccttccttg accctggaag
7441 gtgccactcc cactgticctt tcccaataaa atgaggaaat tgcatcgcat tgtctgagta
7501 ggtgtcattc tattctgggg iggtggggtgg ggcagcacag caagggggag gattgggaag
7561 acaatagcag gcatgctggg gatgcggtgg gctctatggg tacetetctc tetctctctc
7621 tctctctcac tetctctctc tctcggtacc tctcctcgag gggggsfcecs gtacceaatt
7681 cgccctatag tgagtcgcat tacgcgcgct CBctggccgt cgttttacaa cgtcgtgaet
7741 gggaaaaccc tggcgttacc caacttaatc gceetgcagc acatccccct ttcgecagct
7801 ggcgtaatag cgaagaggcc cgcaccgatc gcocttecea acagttgcgc agcctgaatg
7861 gegaatggaa attgtaagcg Ctaatatttt gttaaaattc gcgttaaatt tttgttaaat
7921 cagctcattt tttaaeeaat aggccgaaat cggcaaaatc ccttataaat caaaagaata
7981 gaccgagata gggttgagtg ttgttccagt ttggaacaag agtccactat taaagaacgt
8041 ggaetccaac gteaaagggc gaaaaaccgt ctatcaggge gatggcccac tactocggga
8101 feeatatgaea agatgtgtat ocaccttaae ttaatgattt ttaeeaaaat cattagggga
8161 ttcatcagtg etcagggtca acgagaatta aeatteegte aggaaagett atgatgatga
8221 tgtgettaaa aacttaetca atggctggtt aCgcatatog eaatacatge gaaaaaeeta
8281 aaagagcttg cegataaaaa aggccaattt attgctattt aeegoggett tttattgage
8341 ttgaaagata aataaaatag ataggtttta cttgaagcta aatcttettt ategtaaaaa
8401 atgccctctt gggttatcaa gagggtcatt atatttcgcg gaataacatc atttggtgac
8461 gaaataacta ageacttgtc tcctgttCac tcccctgage ttgaggggtt aacatgaagg
8521 tcatcgatag caggataata atacagtaaa aogctaaaec aataatccaa atccagecat
8581 cccaaattgg tagtgaatga ttataaatw cagcaaaeag taatgggcea ataacacegg
8641 ttgcattggt aaggctcacc aataatccct gtaaagcace ttgetgatga ctctttgttt
8701 ggatagacat cactceetgt aatgcaggta aagcgatccc aecaccagcc aataaaatta
8761 aaacagggaa aactaaccaa ccttcagaca taaacgctaa aaaggcaaat gcactactat
8821 ctgcaataaa tccgagcagt actgccgttt tttcgcccat ttagtggcta ttcttcctgc
8881 cacaaaggct tggaatactg agtgtaaaag accaagaccc gtaatgaaaa gccaaccatc
8941 atgctattca tcatcacgat ttctgtaata gcaccacacc gtgctggatt ggctatcaat
9001 gcgctgaaat aataatcaac aaatggcatc gttaaataag tgatgtatac cgatcagctt
9061 ttgctcccct tagcgagggt taattgcgcg cttggcgtaa tcatggtcat agctgtttcc
9121 tgtgtgaaat tgttacccgc ccacaattcc acacaacata cgagccggaa gcataaagtg
9181 taaagcctgg ggtgcctaat gagtgagcta actcacatta attgcgttgc gctcactgcc
9241 cgctctccag tcgggaaacc tgtcgtgcca gctgcattaa tgaatcggcc aac^cgcggg
9301 gagaggcggc Ctgcgtattg ggcgctcttc cgcttcctcg ctcactgact ogctgcgctc
9361 ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccae
9421 agaaccaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa
9481 cegeaaaaag gccgcgttgc tggcgttttt ccacaggetc egcccccctg acgagcatca
9541 caaaaatcga cgctcaagcc agaggtggcg aaaeccgaea ggactataaa gataccaggc 
9601 gtttccecct ggaagctecc tcgtgcgctc tcctgttccg accctgccgc ctaccggata 
9661 cctgtccgcc tttctccctt cgggaagogt ggcgctttct catagctcac gctgtaggca 
9721 teteagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 
9781 geccgacege tgcgecttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 
9841 cttategcea ctggeagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 
9901 tgetaeagag Ctcttgaagt ggtggcctaa ctaoggctac actagaagga cagtatttgg 
9961 tatctgcget etgctgaagc eagttacctt cggaaaaaga gttggtageb cttgatccgg 
10021 caaacaaacc aecgctggta gcggtggttt ttttgtttge aagcagcaga ttacgcgcag 
10081 aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 
10141 ogaaaactca cgttaaggga ttttggteat gagattatca aaaaggatct tcacctagat 
10201 ccttttaaat taaaaatgaa gttttaaatc aatctaaagt atatatgagt aaacttggtc 
10261 tgacagttae caatgcttaa tcagtgaggc acctatetca gcgatctgtc tatctcgttc
10321 atccatagtc gcctgactec cogtcgtgta gacaactaog acaogggagg getcaccaee
10381 tggccccagt gctgcaatga tacqgcgaga cccacgctca ceggececag attcatcagc
10441 aataaaccag ecagceggaa gggccgagcg cagaagtggt cetgeaaett taceogeete
10501 catccagtct attaattgtt gccgggaagc tagagtaagt agecegceag ttaaCagccc
10561 gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca cgetegtogt ttggtatgge
10621 ttcattcagc tccggttccc aaegatcaag gcgagttaca tgateeceea tgttgtgcaa
10681 aaaagcggtt agctccttcg gtcctccgat cgttgtcaga agtaagtcgg ccgcagtgtt
10741 accactcatg gttatggcag cactgcataa ttctcttact gteatgecat eegtaagatg
10801 cttttctgtg actggtgagt actcaaccaa gtcattctga gaatagtgta tgcggcgace
10861 gagttgctct tgcccggcgt caacacggga taataccgcg eeaeatagea gaactttaaa
10921 agtgctcatc attggaaaac gttcttcggg gcgaaaactc tcaaggatct tacegetgtt
10981 gagatccagc tcgatgtaac ccacCcgtgc acccaactga tctteageat ctCttaettt
11041 caccagcgtt tctgggtgag caaaaacagg aaggcaaaat geegcaaaaa agggaataag
11101 ggcgacacgg aaatgttgaa tactcatact cctccttttt caatactact gaagcattta
11161 tcagggttat tgtcccatga gcggacacaC atbcgaacgt atttagaaaa acaaaeaaat
11221 aggggttceg ogcacaccte ccegaaaagt gccae 



SEQ ID NO:44 (pTnMod (Oval/ENT tag/Proins/PA) - QUAIL)
CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50 
CGCAG06TGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150 
GCCATTGCAT ACGTTGTATC CATATCATAA TATGIACAXT TATATTGGCT 200 
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTQAC TAOTTATTAA 250 
TA6TAATCAA TXAOGGGGTC ATTAGTTCAT AGCCCAXUA T86AGTTC06 300 
C6TTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACC6 CCCAACGACC 350 
CCOG CCCAT T 6ACCTCAATA ATGACQTATQ TTCCCATAOT MCGCC^'Sli 400 
GGGACTTTCC ATT6AC!6TCA A1GG6TGGA6 TATTTACGGT AAACTGCCCA 450 
CTTGGCAeTA CATCAAGTGT ATCATATGCC AAGITiCGCCC CCTATTGACG 500 
TCAAT GACGG TAAATGGCXIC GCCTGGCATT ATCCCCAGTA CATGACCTTA 550 
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600 
CaTGOTGATG OGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG 650 
ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG 700 
TrTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC 750 
CCCATTGACG CAAATGGGCG GTAGGCGTGT ACG6TGGGAG GICIATATAA 800 
G CAGAGCT CG TTTAGTGAAC C0TCAGATC6 CCTGGAGA06 CCATCCACGC 850 
'iVriTTGACC TCCATAI6AA6 ACACC6GGAC CGATCCAGCC TCC6CGGCC6 900 
GCSAACGOTQC ATTGGAACSGC GGATTCCCC6 TGCCAAGAGT (S^OGTAAGTA 950 
OOeOCBmS ACTCTATAGGI CACACCCCTT TGGCTCTTAT GCATGCXATA 1000 
CTGTTTTTGQ CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050 
ATOGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTGQ TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150 
GCCACaACTA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200 
T6ACACGGAC TCT6TATTTT TACAGGATGG GGTCCCATTT ATTATXTACA 1250 
AATTCACATA TACAACAAC6 C06TCCCCCG TGCCCGCA6T TlTCATtAAA 1300 
CATAGCGIGG GATCtCCAOS C6AATCTCG6 GTA06T6TTC CGGACATGQG 1350 
CTCTTCTC06 6XA6CGGCG6 AGCTTCCACA TCCGA6CCCT G8ICCCATSC 1400 
CTCCASOGGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAGIGGAGG 1450 
CCiAGACmS 6CACAGCACA ATGCCCACCA CCA0CA6TGT GCC6CACAAS 1500 
GCCGTGGCGG TAGGGTATGT GTCTGAAAAT GAGCGTGGAG ATTGGGCTCG 1550
CAOQGCn»C 6CAGAT6GAA GACTTAAGGC A6CGGCAGAA GAAGATGCAG 1600 
6CAGCT6A6T TGTT6TATTC TGATAAGA6T CA6AG6TAAC TCCCGTTGCG 1650 
GTOCTOTTAA CGGTGGAGGG CAGTGTRGTC TGAGCAGTAC TCGTTGCTGC 1700 
CGCGCX3CGCC ACCAGACATA ATAGCTGACA GACTAACAGA CTGTTCCTTT 1750 
CCATGGGTCT TTTCTGCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800 
TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACQACT 1850 
CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900 
CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950 
AACATCAAAC GAATCGACC6 ATTGmGGT AATCQTCACC TCCACAAAGA 2000 
GCGACTOSCT GTATAC08TT 6GCA3GCTA6 CTTTATCTSr TOGGGAATAC 2050 
GATGCCCATT GTACTTQTTG ACTGGTCTCA TATTCGTGAa CAAAAACXSAC 2100 
TTATGGXATT GOGAGCTTCA GTCGCACTAC ACGGTCGTTC TOTTACTCTT 2150 
TATCAQAAAG C6TTCC0GCT TTCA8AGCAA TGTTCAAA6A AAGCTCATGA 2200 
CCAATTTCTA GCCGACCTTG CGA6CATTCT ACCGAGTAAC ACCACACCGC 2250 
TCATTGTCAQ TCATGCTGGC TTTAAAGTGC CATGGTATAA ATCCGTTGAQ 2300 
AAGCTGGGTT G6TACTGGTT AAGTCGAGTA AGAGGAAAAQ TACAATATGC 2350 
AGACCTAGGA GCGGAAAACT GGAAACCTAT CAGCAACTTA CATGATATGT 2400 
CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAGGCTQAC TAAAAQCAAT 2450 
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAOGCOGAAA 2500 
AAATCAGCGC TCGACACGGA CTCATTGTCA CCACCCGTCA CCTAAAATCT 2550
ACTCAGC9QTC GGCAAA66A6 CCATGQGTTC TAGCAACTAA CTTACCTOTT 2600 
OAAATTOGAA .CACCQAAACA ACTTGTTAAT ATCTATTOGA AGCGAATGCA 2650 
6ATTGAAGAA ACCTTCCGAG ACTTGAAAAG TCCTGCCTAC G6ACTIAG6CC 2700 
TACGCCATA6 CCGAACGAGC A6CTCA6AGC jSTTrTQATAT CATGCTGCTA 2750 
ATCGCCCTGA TGCTTCAACT AACATGTTGG CTTGCGGGCG TTCATGCTCA 2800
GAAAC3\A66T TGGGACAAGC ACTTCCAGGC TAACACAGTC AGAAATCSAA 2850 
ACGTACTCTC AACAGTTC6C TTAGGCATGG AAGTTTTGCO GCATTCTGGC 2900 
TACACAATAA CAAGGGAAGA CTTACTCGTG GCTGCAACCC TACTAGCTCA 2950 
AAATTTATTC ACACATGGTT ACGCTTTGGG GAAATTATGA TAATGATCCA 3000 
GATCaCTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC TGTGTGTTGG 3050 
TTTTTTGTGG ATCTGCTGTQ CCTTCTAGTT 6CCAGCCATC TGTTGTTTGC 3100 
CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACT6TCCT- 3150 
TTCCTAATAA AATGAGGAAA TTGCAT06CA TTGTCT6AQT AGGTGTCATr 3200 
CTAirCTGGG' GGGtOGGOTG GGGCA6CACA GCAAGGGGGA GGATTGGGAA 3250 
6ACAATAGCA G6CATQCTGG GGATGC6GTG GGCTdATGG GTACCTCTCT 3300 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCGGIAC CTCTCTCTCT 3350 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT C6GTACCAG6 TGCT6AAGAA 3400 
TTGACCCGGT 6ACCAAAGGT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450 
CAATTACTCA 6TGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500 
CATCACAACA AAAACTGATT TAACAAATGG TTGGTCTGCC TTAGAAAGTA 3550 
TATTTGAACA TTATCTTGAT TATATTATTG ATAATAATAA AAACCTTATC 3600 
CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650 
TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CA6CAAATT6 3700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACG6AATGIT AATTCTC6TT 3750 
GACCCTGAGC ACTGATGAAT CCCCTAATGA TTTTGGTAAA AATCATTAAG 3800
TTAAGeXGGA TACACATCTT GTCATATQAT CC06GTAAT6 TOAGTTAGCT 3850 
CACTCAT TAG 6CACCCCAG6 CTTTACACTT TATGCTTCCG GCTCGTATGT 3900 
TGTGTGGAAT TGTSA6CGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 3950 
CATGATTACG CCAAGCGCGC AATTAACCCT CACTAAAGGG AACAAAAGCT 4000 
GGAGCTCCAC CGCGGTQSOG GCCGCTCTAG AACTAGTGGA TCCCCCGGGG 4050 
AGGTCAQAAT GGTTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACA6 4100 
AACAAAAGCT TCTATAACTG AAATATATTT GCTATTGTAT ATTATGATTG 4150 
TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCTGTC 4200 
ATCTGCCA6G CTGGAAGATC ATGGAAGATC TCTGAGGAAC ATT6CAA6TT 4250 
CATACCATAA ACTCATTTGe AATT6AGTAT TATTTT6CTT TSAAT66AGC 4300 
TAT6TTTTGC AGTTCCCTCA GAAGAAAAQC TT6TTATAAA 6CGTCTACAC 4350 
CCATCAAAM ATATATTTAA ATATTCCAAC TACA6AAASA TTTTGfTCTGC 4400 
TCTTCACTCT GATCTCAGTT GGTTTCTTCA CGTACATGCT TCTTTATTTG 4450 
CCTATTTTGT CAAQAAAATA ATAGGTCAAG TCCTGTICTC ACTTATCTCC 4500 
TGCCTAGCAT GGCTTAGKI6 CAC6TTGTAC ATTCAA6AAG GATCAAATGA 4550 
AACAGACTTC TGGTCTGTTA CAACAACCAT AGTAATAAAC AGACTAACTA 4600 
ATAATTGCXA ATTAIGTrrT CCATCTCTAA GGTTCCCACUV TrTTTCTGTT 4650 
TTAAGKTCCC AT TATC TCCT TGTAACTGAA GCTCAATGOA ACATGAACAG 4700 
TATTTCTCAG TCTTTTCTCC A6CAATCCT8 ACG6ATZAGA AGAACTGGCA 4750 
6AAAACACTT T6TTACCCAG AATTAAAAAC TAATATXTGC TCTCCCTTCA 4800 
ATCCAAAAKJ GACCTATTGA AACTAAAATC TGACCCAATC CCATTAAATT 4850 
ATTTCTATC3G CGTCAAAGGT CAAACTTTTO AAGGGAACCT GTGGGTGGGT 4900 
CCCAATTCAG GCTATATATT CCCCAGGGCT CAGCCAQTGG ATCCATGGGC 4950 
TCCATCGGTG CAGCAAGCAT GGAATTTTGT TTTGATGTAT TCAAQGAGCT 5000 
CAAA6TCCAC CATGCCAATG ACAACATGCT CTACTCCCCC TTTGCCATCT 5050 
TGTCAACTCT GGCCATGGTC TTCCTAGGTG CAAAAGACA6 CACCAOGACC 5100 
CAOAXAAAXA AGGTraTTCA CTTTSATAAA CTTCCAGGAT TCCXSAGACA6 5150 
TATTGAAGCT CA6TSIGGCA CATCTGTAAA TSTTCACTCT TCACTTAOAG 5200 
ACATACTCAA CCAAATCACC AAACAAAAT6 ATGCTTATTC CrTCAGCCTT 5250 
GCCAGTAQAC TTTATGCTCA AGA6ACATAC ACAGTCGTGC CGGAATACTT 5300 
GCAATGTGTQ AAGGAACTGT ATAGAGGAGG CTTAGAATCC GTCAACTTTC 5350 
AAACAGCTQC AGATCAAGCC AGAGGCCTCA TCAATGCCTG GGTAGAAAQT 5400 
CAGACAAAC6 GAATTATCAG AAACATCCTT CAGCCAAGCT CCGTGGATTC 5450 
TCAAACTGCA ATGGTCCTGG TTAATGCCAT TGCCTTCAAG GGACTGTGGG 5500 
AGAAAGCATT TAAGGCTGAA GACAC6CAAA CAATACCTTT CAGAGTGACT 5550 
GAGCAAGAAA 6CAAACCTGT GCAGAT6ATG TACCAGATFG GTTCATTTAA 5600 
AGTGGCATCA ATGGCTTCT6 ASAAATGAA '6ATCCTGGAG CTTCCATTT6 5650 
CCAGTQGAAC AATGAGCATG TTGGTGCTGT TGCCT6ATGA TOTCTCAGC3C 5700 
CTTQAGCASC TT6A6AOXAT AATCAGCTTT GAAAAACTGA CICSAATGGAC 5750 
CAGTTCZAlST ATTA3GGAA6 AGAGGAAGST CAAAGTeZAC TTACCTCGCA 5800 
TCAAGJVTGQA 6GAGAAATAC AACCTCACAT CTCTCTTAAT GGCTAtGGGA 5850 
ATTACTGACC TOTTCAOCTC TTCAGCCAAT CTGTCTG6CA TCTCCTCAGT 5900 
AGGGAGCCTG AAGAIATCTC AAGCTGTCCA TGCAGCACAT GCAGAAATCA 5950 
ATGAAGCGGG CAGAGATGTG GTAGGCTCAQ CAGAGGCTGG AGTGGATGCT 6000 
ACTGAAGAAT TTAGGGCTGA CCATCCATTC CTCTTCTGTG TCAAGCACAT 6050 
CGAAACCAAC GCCATTCTCC TCTTTGGCA6. ATGTGTTTCT CCGCGGCCAG 6100 
CAGATGACGC ACCAGCAGAT QACGCACCAG CAGATGACGC ACCAGCAGAT 6150 
OACQCACCAG CAGATGACGG ACCAGCAGAT GACQCAACAA CATGTATCCT 6200 
GAAAGGCTCT TGTGGCTGGA TCGGCCTGCT GGATGACGAT GACAAATTT6 6250 
T6AACCAACA CCCGIGCG6C TCACACCT66 TGQAAGCTCT CTACCCA6I6 6300 
TGOGGOQAAC GAGGCTTCTT CTACACACCC AAGACCCGCC GGGASGCAGA 6350 
GGACCTGC3U3 GrGGGGCAQQ TGGAGCIGGG CGGG6GCCCT GGTGCAOGCA 6400 
GCCIQCAGCC CTTG6CCCIQ 6AG6GGICCC TGCAGAAGOa IGGCATTQTG 6450 
6AACAAT6CT.0TACCAGCAT CTGCTCCCTC TACCAGCTGG AGAACTACTG 6500 
CAACTAGGGC GCCTGGATCC A6ATCACTTC TGGCTAATAA AAGATCAGA6 6550 
CTCTAGAGAT CTGTGTGTTG GTTTTTTGTG GATCTGCTGT GCCTTCTAGT 6600 
TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT TGACCCTGGA 6650 
AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA ATTGCATCGC 6700 
ATTGTCTGAG TAGGTGTCAT TCTATTCTGG GGGGTGGGGT GGGGCA6CAC 6750 
AGCAAGGGGG AGGATTGGGA AGACAATAGC AGGCATGCTG GGGATGCG6T 6800 
G6GCICXAIG G6TACCTCTC TCTCTCTCTC TCTCICXCTC TCTCTCTCTC 6850 
TCTCTOGGXA CCTCTCTCGA GGGGGGQCCC GQIACOCAAT TCGCCCTAIA 6900 
6T8A8XG8XA TTAOGCQCQC TCACTGGCCG TC gmTA CA ACGTCGTGAC 6950 
TQG6AAAACC CTGGCSTTAC CCAACTTAAT CGCCTTGCAG CACATCCCCC 7000 
TTTC6CCAGC T6G06TAATA GCGAA6AGGC CCGCACCGAT CGCCCTTCCC 7050 
AACAGTT6C6 CAGCCTGAAT GGCGAATGGA AATTGTAAGC GTTAATATTT 7100 
TGTTAAAATT CGCGTXAAAT TTTTGTTAAA TCAGCTCATT TTTTAACCAA 7150 
TAG6CCQAAA TCG6CAAAAT CCCTTATAAA TCAAAAQAAT AGACCGAGAT 7200 
AGGGTTGAGT GTTGTTCCAG TTTGGAACAA GAGTCCACTA TTAAAGAACG 7250 
TGGACTCCAA CGTCAAAGGG CGAAAAACCG TCTATCAGG6 CGATGGCCCA 7300 
CTACTCCGGG ATCATATGAC AAGATGTGTA TCCACCTIAA CTEAATGATT 7350 
TTTACCAAAA TCATTAGG66 ATTCATCAGT GCTCAGGeTC AA06ASAAIT 7400 
AACATTCOQT CSGGAAAGCT TAT6AT6AT6 ATGT6CTTAA AAACTTACTC 7450 
AATGGCTGGT ZAT6CATATC GCAATACAT6 OGAAAAACXTT AAAAGAGCTT 7500 
GCCX3AXAAAA AAGGCCAATT TATTGCTATT TACCGCGGCT TTTTATT6AG 7550 
CTTQAAA6AT AAATAAAATA GATAGCTTTT ATTT6AA6CT AAATCTTCTT 7600 
TATCCSTAAAA AATCCCCTCT TGGGTTATCA AGAGGGTCAT TATATTTCGC 7650 
GGAATAACAT CATTTGGTGA CGAAATAACT AAGCACTTGT CTCCTGTTTA 7700 
CTCCCCTGAG CTTGAGGGGT TAACATGAAG GTCATCGATA GCAGGATAAT 7750 
AATACAGTAA AACGCTAAAC CAATAATCCA AATCCAGCCA TCCCAAATTG 7800 
GTAGTGAATO ATTATAAATA ACAGCAAACA 6TAATGG6CC AATAACACC6 7850 
GTTGCATraG TAAGGCTCAC CAATAATCCC T6TAAACCAC CTT6CTGAT0 7900 
ACTCTTTGTT TGGATAGACA TCACTCCCre TAATGCAGGT AAAOOQATCC 7950 
CACCAOCAGC CAATAAAA.TT AAAACAGG6A AAACTAACCA ACCTTCAGAT 8000 
ATAAACGCTA AAAAGGCAAA TGCACTACTA TCTGCAATAA ATCCGAGCAG 8050 
TACTGCCGTT TTTTCGCCCC ATTTAGTGGC TATTCTTCCT GCCACAAA6G 8100 
CTTGGAATAC TGAGTGTAAA AGACCAAGAC CC6CTAATGA AAAGCCAACC 8150 
ATCATGCTAT TCCATCCAAA ACGATTTTCG GTAAATAGCA CCCACACCGT 8200 
TGCGGGAATT TGGCCTATCA ATTGOGCTGA AAAATAAATA ATCAACAAAA 8250 
TGGCATOGTT TXAAATAAAG IGATGTATAC CGAATTCAGC TTTTGTTCCC 8300 
TTTAGTGA08 GTTAATTGC6 OGCTtOGCGT AATCAT66TC ATAGCT6TTT 8350 
CCTGTGTGAA ATT6TTATCC GCTCACAATT CCACACAACA TACQAGCCGO 8400 
AA6CATAAAG TGTAAAGCCT GGGGTGCCTA ATQAGT6A6C TAACTCACAT 8450 
TAATTG06TT COGCTCACTG CCCGCTTTCC AGTCQG6AAA CCTGT CGTGC 8500 
CAGCTQCATT AATGAATOSG CCAA06C6C6 G06A8AG6CQ GTTPGCGIAT 8550 
TGGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC TCGGTCGTTC 8600 
GGCTGCGGCG AGCiSGTATCA 6CTCACTCAA AG6CGGTAAT ACGGTTATCC 8650 
ACAGAATCAG GGGATAACOC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA 8700 
AAAGGCCAGG AACCGTAAAA AGGCCX^GTT GCTGGCGTrT TTCCATAGGC 8750 
TCC6CCCCCC TGACGAGCAT CACAAAAATC GACGCTCAA6 TCAQAGGTGG 8800 
CGAAACCCGA CAGGACTATA AAGATACCAG GCGTrTCCCC CTGGAAGCTC 8850 
CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTCACCGQA XACCTGTCC6 8900 
CCTiraCTCCC TTCGGGAAGC QTGGCGCTTT CTCATA6CTC AC6CT6TAG0 8950 
TATCTCAGTT CGGTQTAGGT C6TTCGCTCC AAGCTGGGCT GI6T6CA06A 9000 
ACCCCCCGTT CAGCCCCACC GCT6CGCCTT ATCCGGTAAC TATCQTCTTO 9050 
AGTCCAACCC QGTAAGACAC GACTTATOGC CACT6GCAGC AGCCACTGGT 9100 
AACAGGATTA GCAGAGCGAG GTATGTAG6C G6TGCTACAG AGITCTTGAA 9150 
GTGGXGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT GGTATCTGCG 9200 
CTCTQCTQAA GCCAGTtACC TTCGGAAAAA QAGTTGGTAG CTCTTGATCC 9250 
GGCAAACAAA CCACCGCTGG TA6CGGTGGT TTTTrrGTTT GCAAGCAGCA 9300 
GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG ATCTITICTA 9350 
CGGGGTCTGA CGCTCAGTG6 AACGAAAACT CACGTTAAGO GATTTTGGTC 9400 
ATQAGATTAT CAAAAAGGAT CTXCACCTAG ATCCmTAA ATTAAAAATG 9450 
AAGTrXTAAA TCAATCXAAA GTATATATGA GTAAACITaG OICrQACAGTT 9500 
ACCAATGCIT AATCAfiTQAG 6CACCTATCT CAGC8ATCT6 TCTATTTC6T 9550 
TCATCCATA6 TTGCCTGACT CCCC6TCGT6 TAGATAACTA CGATACGQ6A 9600 
GGGCTTACCA TCTGGCCCCA GTGCTGCAAT QATACCGCQA QACCCA06CT 9650 
CACCQ6CTCC AGATTTATCA 6CAATAAACC A6CCAGCCGQ AA6GGCCGAG 9700 
COCAGAAOIG GTCCTGCAAC TTTATCCGCC TCCATCCAGT CTATTAATTO 9750 
TTGCCGGOAA GCTAGAGTAA GTAGTTCGCC AGTrAATAGT TTGCGCAACO 9800 
TTGTTGCCAT TGCTACAGGC ATCGTGCTGT CACGCTCGTC GTTTGGTAT6 9850 
GCrrCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC 9900 
CATGTTGTGC AAAAAAGCGG TTAGCTCCTT CGGTCCTCCQ ATCGTT6TCA 9950 
GAAGTAAGTT GGCCX3CAGTG TTATCACTCA TGGTTATGGC AGCACTGCAT 10000 
AATTCTCTTA CTGTCAT6CC ATCCGTAAOIV TQCTTTrCIG TGACTGGTGA 10050 
GCACICAAOC AAOTCATTCT GAGAATA6T6 TATG0GG08A CCXSAGTTGCT 10100 
CTXQOOOGGC GTCAATAOGG GATAATACC8 CX3CCACATA0 CA6AACTTTA 10150 
AAAGTSCTCA TCATTGGAAA AOSTTCTTCS Q680GAAAAC TCTCAA6SAT 10200 
CTTACOGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT isCACCCAACT 10250 
GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG AGCAAAAACA 10300 
GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC GGAAATGTTG 10350 
AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT 10400 
ATTGTCTCAT GAGCGGATAC ATATTTGAAT QTATTTAGAA AAATAAACAA 10450 
ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCAC 10487 
CTGAiCGCGCC CTGTAGOSGC GCATTAAGCG CGGC6GGTGT CX3TOGTTAC6 50 
CGCAGCSTQH CCGCTACACT TGCCAGCGCC CTAGCGCCCQ CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTC6 CCJVCGTTCGC CGGCATCAGA TTGGCTATTQ 150 
GCCATTQCAT ACGTT6TATC CATATCATAA TATSTACATT TATATTG6CT 200 
CAT5TCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATTAA 250 
TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA TGGAGTTCCG 300 
CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 350 
CCCGCCCATT GACGTCAATA ATGACQTATQ TTCCCATAGT AACGCCAATA 400 
GGGACTTTCC ATTGACCTCA ATGGGTGGAG TATTTACGGT AAACT6CCCA 450 
CTTGGCAGTA CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATT6ACG 500 
TCAATQACGG TAAATGGCCC GCCTGGC&TT ATGCCCAGTA CATGACCTTA 550 
TQGQACTTTC CTACTTG6CA GTACATCTAC GTATTAGTCA TCGCTATTAC 600 
CArrGCTGATG OGGSTTTTGGC AGTACATCAA TGGGCGTGGA TAGCG6TTT6 650 
ACTC AOGGGG ATTTOCAAGT CTCCACCCCA TT6A0GTCAA TGGGAOXTTB 700 
TTTTGGCACC AAAATCAAC6 GGACTTTCCA AAATGTCQTA AC!AACTC06C 750 
CCCATTGACQ CAAATGGGCG GTAGGCGTGT ACGGTGGGAG GTCTATATAA 800 
GCAGAGCTCG TTTAGTGAAC CGTCAGATCG CCTGGAGACQ CCATCCACGC 850 
TGTTTTGACC TCCATAGAAG ACACCGGGAC CGATCCAGCC TCC6CGGCCG 900 
GGAACGOTGC ATTGGAACGC GGATTCCCCG TGCCAAGAGT GAC6TAAGTA 950 
CCGCCTATAG ACTCTATAGG CACACCCCTT TGGCTCTTAT GCATGCTATA 1000 
CTGTTTTTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050 
ATGGTATAGC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTGG TGACGATACT TTCCATTACT AATCCATAAC AT66CTCTTT 1150 
GCC3VCAACTA -TCTCIATTGO CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200 
IQACaVCGQAC JTCTOTATTrr TACAGQATOO GGXCCCATTT AURXTTACA 1250 
AATTCACATA TACAACAACO C0GTCCCCO6 TGCC06C3U3T TTTTAriAAA 1300 
C3VTAQ0GXG6 -GATCXCCACG C6AATCT0S6 GTAGQT&TTC CGtSUCRTSGG 1350 
CTCTTCTCCG GTA60GG0Q6 A6CTTCCACA TCCGAGCCCT G6TCCCATGC 1400 
CTCCAGCQGC TCATGGTCGC TCGGCAGCTC CTTGCTCCTA ACAQTGGAGG 1450 
CCAQACTTAG GCACAGCACA ATGCCCACCA CCACCAGTCT GCCQCACAAG 1500 
GCCGTGQCGG TAGGOTATGT GTCTGAAAAT GAGCGTGGA6 ATTGGGCT06 1550 
CACGGCT6AC GCAGATGGAA GACTTAAGGC AGCGGCAGAA GAAGATGCA6 1600 
GCAGCTGAGT TGTTGTATTC TQATAAGAGT CAQAGGTAAC TCCCGTTGCG 1650 
QTGCTSTTAA CGGTGGA6GG CAGT6TAGTC TGAGCA6XAC TCGTTGCTGC 1700 
CGCGCGCGCC ACCAGACATA ATAGCTGACA GACTAACaGA CTGTTCCTTT 1750 
CCATGGSrCT TT TCTG CftOr CACCGT066A CCATQTG1GA ACTTGATATT 1800 
TTACATGATT CTCTTXACCA ATTCIGCCCC GAATTACACT lAAAACGACT 1850 
CAACAGCTIA A0GTTG6CTT GCCAC6CATT ACTTQACTCT AAAACTCTCA 1900 
CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAOCQAQ AACAAAACAT 1950 
AACA TCAA AC GAATCGACC6 ATTGTTAGGT AATCGTCACC TCCACAAAGA 2000 
GCGACTCGCT OTATACCGTT GGCATGCTAG CTTTATCrGT TCGGGAATAC 2050 
GATGCCCATT GTACTTGTTG ACTGGTCTGA TATTCGTGAG CAAAAACGAC 2100 
TTATGGTATT GCGAGCTTCA GTCGCACTAC ACGGTCGTTC TGTTACTCTT 2150 
TATGAGAAAG CGTTCCaSCT TTCAGAGCAA TGTTCAAAGA AAGCTCATGA 2200 
CCAATTTCTA GCCGACCTTG CGAGCATTCT ACCGAGTAAC ACCACACCGC 2250 
TCATTGTCAO TGATGCTGGC TTTAAAGTGC CATGGTATAA ATCC6TTGAG 2300 
AAGCTGGGTT GGIACTG6TT AA6TC6A0TA AGAG6AAAA6 TACAATAXGC 2350 
AGACCIAGGA GC68AAAACT G6AAACCTAT CAGCAACXTA CATGATATST 2400 
CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAOGCTGAC 7AAAAGCAAT 2450 
CCAATCZCAT GCCAAATTCT ATTGTATAAA TCTOGCTCTA AAGGCCQAAA 2500 
AAATCAGOGC TOSACACGGA CTCATTGTCA CCACCOTTCA CCTAAAATCT 2550 
ACTCAGCeiC GGCAAAGGAG CCATGGGTTC TAGCAACTAA CTTACCTGTT 2600 
GAAATTCGAA CACCCAAACA ACTTGTTAAT ATCTATTCBA AGCGAATGCA 2650 
GATTGAAGAA ACCTTCC6AG ACTTGAAAAG TCCTGCCTAC GQACTAGGCC 2700 
TAC6CCATAS CCGAACGAGC AGCTCAGAGC GTrTTGATAT CATGCTGCTA 2750 
ATCX3CCCTGA TGCTTCAACT AACATQTTGG CTTGCGGGCG TTCATGCTCA 2800 
GAAACAAG6T TGGGACAAGC ACTTCCAGGC TAACACAGTC AGAAATCCSAA 2850 
ACGXACTCTC AACAGTTCGC TXAGGCATGG AAGTTTTCC6 GCATTCTGGC 2900 
TACACAATAA CAAGGGaASA CTTACTCGTG GCTGCAACCC TACTAGCTCA 2950 
AAATTTATTC ACACATGOTT ACGCTTTSGG GAAATTATGA TAATGATCCA 3000 
6ATCACTTCT GGCTAAT3UVA AGATCAGA6C TCTAGAGATC TGTGTGTTGG 3050 
TTTTTTGTaS ATCTGCTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 3100 
CCCTCCCCCG TQCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT 3150 
TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT 3200 
CTATTCTGGG GGGTGGGGTQ GGGCAGCACA 6CAAGGGGGA GGATTGGGAA 3250 
GACAAXAGCft GGCATGCTGQ G6ATGCG6T6 GGCTCTATGG 6TACCTCTCT 3300 
CTCTCTCTCT CTCTCTCTCI CTCTCTCTCT CTCTCGGTAC CTCTCTCTCT 3350 
CTCTCTCTCr CTCTCTCTCT CTCTCTCTCT CGGT ACCAG G TGCTQRAQAA 3400 
TT6ACCC6GT GACCAAA6GT GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450 
CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATSCCTA 3500 
CATCACAACA AAAACTGATT TAACAAATGC TTGGTCTCCC TTAGAAAGTA 3550 
TATTTGRACA TTATCTTQAT TATATTATTG ATAATAATRA AAACCTTATC 3600 
CCTATCCAAQ AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650 
TTAGCCTTGA ATACATTACT GGTAAGGTAA ACGCCATTGT CAQCAAATTG 3700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACGGAATGTT AATTCTCGTT 3750 
6ACCCTGAGC ACTGATGAAT CCCCTAATGA TTTTGGTAAA AATCATTAAG 3800 
TTAAGOXGCA TACACATCTT GTCATATGAT CCCGGTAATG TGAGTTAGCT 3850 
CACTCATTAG GCACCCCAGG CTTTACACTT TATGCTTCCQ GCTCGTATGT 3900 
TQTOIGQMT TGXGASCGGA TAACAATTTC ACACAGGAAA CAGCTAT6AC 3950 
CAT6ATTACG CCAAGCGCGC AATTAACCCT CACTAAAGGQ AACAAAAGCT 4000 
GGAQCTCCAC CGC66TGGCG GCCGCTCTAG AACTAGTGQA TCCCCCGGQO 4050 
AGGTCAGAAT GGTTTCTTTA CTGTTTBTCA ATTCTATTAT TTCAATACAO 4100 
AACAATAGCT TCTATAACTG AAATATATTT GCXATTGTAT ATIAT6ATT6 4150 
TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCreTC 4200 
ATCTGCCAGG CCATTAAGTT ATTCATGGAA GATCTTTGAQ GAACACTGCA 4250 
AGTTCATATC ATAAACACAT TTGAAATTGA GTATTG^nTT GCATTGTATG 4300 
GAGCIATGTT TTGCTGTATC CTCAtSAAAAA AAC3TTTQTTA TAAAGCATTC 4350 
ACACCCATAA AAAGATAGAT TTAAATATTC CAGCTATAGG AAAOAAAGTG 4400 
CGTCTGCTCT TCACTCTAGT CTCAGTTGGC TCCTTCACAT GCATGCTTCT 4450 
TTATTTCTCC TATTTTGTCA AGAAAATAAT AGOTCACGTC TTGTTCTCAC 4500 
TTATGTCCTG CCIAGCATGG CTCAGATGCA CGTTGTACAT ACAAGAAGGA 4550 
TCMATQAAA CAQACTTCTG CTCTGTTACT ACAACCATAO TAATAAGCAC 4600 
ACTAACTAAT -AATT6CT2UVI TAT6TTTTCC ATCTCTAAGQ TTCCCACATT 4650 
'I' T T C J G 'l'TTT CTTAAAGATC CCATTATCTG 6TTGTAACTG AAGCTCAATG 4700 
GAACAT6AGC AATATTTCCC AGTCTTCTCT CCCATCCAAC A6TCCTGATG 4750 
GATTA6CA6A ACAGGCA6AA AACACATTGT TACCCAGAAT TAAAAACTAA 4800 
TATTTGCTCT CCATTCAATC CAAAATGGAC CTATTQRAAC TAAAATCTAA 4850 
CCCAATCCCA TTAAATGATT TCTATGGCGT CAAAGGTCAA ACTTCTQAAG 4900 
GGAACCTGTG GGTGGGTCAC AATTCAGGCT ATATATTCCC CAGGGCTCAG 4950 
CGGATCCAT6 GGCTCCATCG GCGCAGCAAG CATGGAATTT TGTTTXGATG 5000 
TATTCAAGGA GCTCAAAGTC CACCATGCCA ATGAGAACAT CTTCTACTGC 5050 
CCCATTQCCA TCATGTCAGC TCTAGCCATG GTATACCTGG GTGCAAAAGA 5100 
CA6CACCA6G ACACM3ATAA ATAAGGTT6T TCGCTTTGAT AAACTTCCAG 5150 
GATTCGGAGA CAGIWXTGAA GCTCAGIGTG GCACATCTGT AAAC6TTCAC 5200 
TCTTCACTTA GAfiACATCCT CAACCAAATC ACCAAACCAA ATGATGTTTA 5250 
TTC8TTCAGC CTTQCCAGXA GACTTTATGC TGAAGAGAGA TACCCAATCC 5300 
TQCCAGAATA CTT6CAGTGT GIGAAGGAAC TGTATAQAGG AGGCTTGOAA 5350 
CCTATCAACT TTCAAACAGC TGCAGATCAA 6CCAGAGAGC TCATCAATTC 5400 
CTGGGTAGAA AGTCAGACAA ATGGAATTAT CAGAAATGTC CTTCAGCCAA 5450 
GCTCCGTGGA TTCTCAAACT GCAATGGTTC TGGTTAATGC CATTGTCTTC 5500 
AAAGGACTGT GGGAGAAAAC ATTTAAGGAT GAAQACACAC AAGCAATGCC 5550 
TTTCAGAGTG ACTGAGCAAG AAAGCAAACC TGT6CAGAT6 ATGTACCAQA 5600 
TTGGTTTATT XAOAGTGGCA TCAATGGCTT CTGASAAAAT GAAGATCCTG 5650 
GAGCTTCCAT TTGCCAOrGQ GACAATGA6C ATGTTGGIGC TGTTGCCTGA 5700 
TQAAGTCTCA GGCCHCAfiC AGCTTGAGAS TATAATCAAC TTT6AAAAAC 5750 
TGACTGAATO GACCAGTTCT AATGTTATGG AAQAGAG6AA GATCAAA6IG 5800 
TACTTACCTC GCATGAAGAT GGAGGAAAAA TACAACCTCA CATCT6TCTT 5850 
AATGGCXATG GGCATTACTG AC6TGTTTA6 CTCTTCAGCC AATCTOTCPQ 5900 
GCATCTCCTC AGCAGAGAGC CTGAAGATAT CTCAAGCTGT CCAT6CAGCA 5950 
CATGCA6AAA TCAATGAAGC AGGCA6AGAG GTGGTAGG6T CAGCAGAGGC 6000 
TGGAGTGGAT GCTGCAAGCG TCTCT6AA6A ATTTA60GCT GACCATCCAT 6050 
TCCTCTTCTO TATCAAGCAC ATCGCAACCA ACGCCGTrCT CTTCTTTGGC 6100 
AGATGTGTTT CCCCTC06C6 GCCAGCAGAT GACGCftCCAG CAGAT6ACGC 6150 
ACCAGCAQAT SACQCACCA6 CA6IVTGAC6C ACCA6CAQAT GAC6CACCA6 6200 
CA^TGAOGC AACMCAT6T ATCCTGAAAG GCTCTT6TGG CTGQATCGGC 6250 
CTGCTGQAT6 ACQATGACAA AAAATACAAA AAAGCACTGA AAAAACT86C 6300 
AAAACTGCTG TAATGAGGGC 6CCT6GATCC AGATCACTTC TGGCTAATAA 6350 
AAGATCAGAG CTCTAGAGAT CTGTGTGTTG GTTTTTTGTG GATCTGCTGT 6400 
GCCTTCTAQT TGCCAGCCAT CTGTTGTTTG CCCCTCCCCC GTGCCTTCCT 6450 
TGACCCTGGA AGGTGCCACT CCCACTGTCC TTTCCTAATA AAATGAGGAA 6500 
ATTGCATCGC ATTGTCTGAG TAGGTGTCAT TCTATTCTGG GGGGTGGG6T 6550 
GGGGCAGCAC AQCAAGG6GG AGGATTGGGA AGACAATAGC AGGCATGCTG 6600 
GGQATGCGGT GGGCTCTATG GGTACCTCTC TCTCTCTCTC TCTCTCTCTC 6650 
TCTCTCTCTC TCTCTCGGTA CCTCTCTCGA GGGG6GGCCC GGTACCCAAT 6700 
TCGCCCTATA GTGAGTCGTA TTACGC6CGC TCACTG6CC6 TCGTTTTACA 6750 
ACGTOGTGAC TG6GAAAAOC CT6G06TXAC CCAACTTAAT C6CCTT6CA0 6800 
CACATCCCCC TTTCQCCAGC TGGCGTAATA GCQAAGA6GC CCGCACCQAT 6850 
CGCCCTTCCC AACASITGOO CA6CCTQAAT GGCGAATB6A AATrGTAAGC 6900 
OrrAATAXTT TGTTAAAATT CGCGTTAAAT TTTT8TTAAA TCASCTCATT 6950 
TTTTAACCAA TAGGCCGAAA TCGGCAAAAT CCCTTATAAA TCAAAAQAAT 7000 
AGACCGAQAT AGGGTTGAGT GTTGTTCCAG TTTGGAACaA GAGTCCACTA 7050 
TTAAAGAACG TGGACTCCAA CGTCAAAGGO C6AAAAACCX3 TCTATCAGGG 7100 
CGATGGCCCA CTACTCCGGG ATCATATGAC AA6ATGTGTA TCCACCTTAA 7150 
CTTAATQATT TTTACCAAAA TCATTAC3GGG ATTCATCAGT GCTCAGGGPC 7200 
AACQAOAATT AACATTCCGT CAGGAAAGCT TATGATGAT6 ATGTGCTTAA 7250 
AAACTTACrC AATGGCTGGT TATGCATATC GCAATACATO CGAAAAACCT 7300 
AAAAOAGCXT GCCGATAAAA AAGGCCAATT TATTGCTATT TACCGCGGCT 7350 
TTTTATTSAa CITGAAAQAT AAATAAAATA QATAGGTTTT ATrTGAAGCT 7400 
AAAICTTCTT TATCGTAAAA AAT6CCCTCT TGOGTTATCA AGA6GGTCAT 7450 
TAZATTZCGC GQAAXAACAT CATTTGGTGA CGAAATAACT AAGCACTTOT 7500 
CTCCT U T TTA CTCCCCTQAS.- CTTGAGGGGT TAACATQAA6 GTCATCQATA 7550 
GCAGGATAAT AATACAGTAA AA06CTAAAC CAATAAOCCA AATCCA6CCA 7600 
TQCCAAATTG GTA6TGAATG ATTATAAATA ACA6CAAACA GTAATGGGCC 7650 
AATAACACCG GTT6CATTGG TAAGGCTCAC CAATAATCCC T6TAAA6CAC 7700 
C T TG CTOATQ ACTCTTTGTT TGGATAGACA TCACTCCCTG TAATGCA6GT 7750 
AAAGOGATCC CACCACCAGC CAATAAAATT AAAACAGGGA AAACTAACCA 7800 
ACCTTCAQAT ATAAACGCTA AAAAGGCAAA TGCACTACTA TCTGCAATAA 7850 
ATCCGAGCA6 TACTGCCGTT TTTTCGCCCC ATTTAGTGGC TATTCTTCCT 7900 
GCCACAAAGG CTT6GAATAC TGAGTGTAAA AGACCAAGAC CCGCTAATGA 7950 
AAAGCCAACC ATCATGCTAT TCCATCCAAA ACGATTTTC6 6TAAATAGCA 8000 
CCCACACCGH TGCGGGAATT TGGCCTATCA ATT6CGCTGA AAAATAAATA 8050 
ATCAACAAAA TGGCATCOTT TTAAATAAA6 TSATGTATAC 0SAATTCA6C 8100 
TTTTGTTCCC TTTAGTGAGG GTTAATTGCG C6CTTGGC6T AATCATGGTC 8150 
ATAGCTOTTT CCT6TGTGAA ArrGTrATCC GCTCACAATT CCACACAACA 8200 
TACOAGCCGQ AAQCATAAAQ T6TAAAGCCT GGGOTGCCTA AIGnGTGAiSC 8250 
TAACTCACAT TAATTGCGTT GCGCTCACT6 CCCGCTTTCC AGTCG6GAAA 8300 
CCTQTCGTGC CA6CTGCATT AATGAATCG6 CCAACGCCCXS GGGAQAGGCG 8350 
GTTTGCGTAT TGGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC 8400 
TCGGTCGTTC GGCZGCGGCG AGC6GTATCA GCTCACTCAA AG6CG6TAAT 8450 
ACGGTTATCC ACAGAATCAG GGGATAACXK: AGGAAAGAAC ATGTGAGCAA 8500 
AAGGCCAQCA AAAGGCCAGG AACCGTAAAA AGGCCGC6TT GCTGGCGTTT 8550 
TTCCATAGGC TCCGCCCCCC TGACGAGCAT CACAAAAATC 6ACX3CTCAA6 8600 
TCAGAGGIG6 C6AAACCCQA CAGGACTATA AAGATACCA6 GCGITrCCCC 8650 
CTG6AA6CTC CCTCGT6(^ TCTCCTGTTC CGACCCTGCC GCTTACCGSA 8700 
TACCTGTCCG CCTXTCTCCC TTCXSGGAAGC GTGGCGCTTT CTCAXA6CTC 8750 
ACOerrSTAGQ TATCTCAGTT CQGTGTAGGT C6TTCGCFCC AAGCT6G6CT 8800 
GTGTGCACGA ACCCCCCGTT CAGCXTCGACC GCTGCGCCTT ATCCGGTAAC 8850 
TATCQTCTTG AGTCCAACCC GGTAAGACAC GACTTATCGC CACTG6CAGC 8900 
AGCCACTGGT AACAGGATTA 6CAGA6CGAG 6TATGTAGGC GGTGCTACA6 8950 
AGTTCTTOAA GTGGTGGCCT AACTACGGCT ACACTAGAA6 GACAGTATTT 9000 
GGTATCTGCG CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG 9050 
CTCITGATCC 6GCAAACAAA CCACCGCTGS TAfiCJGGTGGT TTTTTTGTTT 9100 
GCAAGCAGCA GATTACGCGC AGAAAAAAA6 6ATCTCAAGA ASATCCTTTQ 9150 
ATCTTTTCTA CGGGGTCTCA CGCTCAGTGG AACGAAAACT CACGTTAAGG 9200 
GATTTTGGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 9250 
ATTAAAAAT6 AA6TITTAAA TCAATCTAAA GTATATATGA GTAAACTtGG 9300 
TCT6ACAGTT ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTO 9350 
TCTATTTCGT TCATCCATAO TTGCCIGACT CCCCGTCGTQ TAGATAACTA 9400 
CGATA0GG6A G66CTTACCA TCTGGCCCCA GTGCTGCAAT GAIACCGCGA 9450 
GACCCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC AGCCAGCCX3G 9500 
AAGGGCCGA6 CX3CAGAAGT6 GTCCTGCAAC TTTATCCGCC TCCATCCAGT 9550 
CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT 9600 
TTGCGCSiACG TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC 9650 
GTTTGGTATG GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCQAGTTA 9700 
CATGATCCCC CAIGXT6TGC AAAAAAGCGG TTA6CTCCTT CGGTCCTCCQ 9750 
ATCGfTT6TCA OAAGTAAGTT GGCCGCAGTG TTATCACTCA TGGTTATGGC 9800 
A6CACTBCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA T6CITTTCTG 9850 
TBACTOBTGA 6TACTCAACC AAGTCATTCT GAGAATACrra TAT0CGG06A 9900 
CCGAGnOCT CTTGCCCQGC GTCAATACGG GATAATACCG C6CCACATA0 9950 
CAGAACrrTA AAAiGTGCTCA TCATTG6AAA ACXSTTCTTCG GGQCGAAAAC 10000 
TCTCAA66AT CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCQT 10050 
GCACCCAACT GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG 10100 
AGCAAAAACA GGAAG6CAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC 10150 
GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT 10200 
TATCAGGGTT ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA 10250 
AAATAAACAA AXAGGGGTTC CGCGau:ATT TCCCCGAAAA GT6CCAC 10297 



SEQ ID NO:46 (pTnMod(Oval/ENT tag/P146/PA) - QUAIL) 

CTGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG 50 
CGCA6CQTGA CCGCTACACT TGCCAGCGCC CXAQCGCCCO CTCCTTTCGC 100 
TTTCTTCCCT TCCTTTCTCG CCACGTTCGC CGGCATCAGA TTGGCTATTG 150 
GCCATT6CAT AC6TTGTATC CATATCATAA TATGTACATT TATATTGGCT 200 
CATGTCCAAC ATTACCGCCA TGTTGACATT GATTATTGAC TAGTTATIAA 250 
TAGTAATCAA TXAC66GGIC ATTA6TTCAT A6CCCATATA TGGAGTTCOQ 300 
OGTTACAXAA CTTAOGGTAA ATGGCCC6CC TG6CT6ACC6 CCCAACGACC 350 
CCC6CCCATT GAOGTCAAZA AT6ACGIAT6 TTCCCAXAQT AAC60CAATA 400 
GGGACTTTCC ATT6ACGTCA ATGGGTGQAG TATTTAOGOT AAACTQCCCA 450 
CTTGGCAGTA' CATCAAGTGT ATCATATGCC AAGTACGCCC CCTATTGAC6 500 
TCAATGAC6G TAAATGGCCC GCCTOGCATT ATGCCCAGTA CATQACCTTA 550 
TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA TCGCTATTAC 600 
CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTQ 650 
ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGG6AGTTTG 700 
TTTTGGCACC AAAATCAACG GGACTTTCCA AAATGTCXSTA ACAACTCCGC 750 
(X!CATTGACG CAAATGGGCG GTA6GCGTGT ACG6TGGGAG 6TCTATATAA 800 
6CA6AGCTCX3 TTTAGTGAAC CGTCA6ATC6 CCTG6AGACG CCATCCACGC 850 
tGTTTTGACC TCCATA6AA6 ACACCGGGAC CGATCCAGCC TC0606GCC6 900 
GGAA0GGX6C ATTG6AACGC GGATTCCCCG TGCCAAGAOT GAC6IAA6TA 950 
CC GCCTM AG ACTCTATAG6 CACACCCCTT TOGCTCTIAT GCATGCTAXA 1000 
CTGrXTrrTGG CTTGGGGCCT ATACACCCCC GCTTCCTTAT GCTATAGGTG 1050 
ATGGTATAQC TTAGCCTATA GGTGTGGGTT ATTGACCATT ATTGACCACT 1100 
CCCCTATTGQ TGACGATACT TTCCATTACT AATCCATAAC ATGGCTCTTT 1150 
GCCACAACIA TCTCTATTGG CTATATGCCA ATACTCTGTC CTTCAGAGAC 1200 
TGACAC66AC TCT6TATTTT TACAGGATGG GGTCCCATTT ATTATTXACA 1250 
AATTCACATA TACAACAAC6 COGTCCCCCG TGCCCGCAGT TTTTATTAAA 1300 
CATAGCGTGG GAICTCCACG C6AATCT060 GTAC6TGTTC CXSGACATG66 1350 
CTCTTCTCCG GTA6CGGCGG AGCTTCCACA TCCGAGCCCT GGTCCCATQC 1400 
CTCCAGCG6C TCATG6TCQC TC6GCAGCTC CTT6CTCCTA ACAGIGGAGO 1450 
CCAGACmG GCACAGCACA ATGCCCACCA CCACCAGIGT GC06CACAA6 1500 
GCCGriGQCBQ TAG6GTATGT GTCTGAAAAT GAGC6TGGAQ ATTGGGCTCG 1550 
CAC6GCT6AC GCAGATGGAA GACTTAAGGC AGCG6CAGAA GAA6ATGCAG 1600 
GCAGCTGAGT TOCTCTATTC TGATAAGAGT CAGAGGTAAC TCCCGITGCG 1650 
GT6CT6TTAA C6QTGGAGGQ CA6TGTAGTC TGA6CA6TAC TCOTTGCTGC 1700 
CGCGCGCGCC ACCAGACATA ATAGCTGACA 6ACTAACAGA CTGTTCCTTT 1750 
CCATGGGTCT TTTCTCCAGT CACCGTCGGA CCATGTGTGA ACTTGATATT 1800 
TTACATGATT CTCTTTACCA ATTCTGCCCC GAATTACACT TAAAACGACT 1850 
CAACAGCTTA ACGTTGGCTT GCCACGCATT ACTTGACTGT AAAACTCTCA 1900 
CTCTTACCGA ACTTGGCCGT AACCTGCCAA CCAAAGCGAG AACAAAACAT 1950 
AACATCAAAC 6AATCGACCG ATTGTTAG6T AATCGTCACC TCCACAAAGA 2000 
GCGACT06CT GTATACC6IT GGCATGCTAG CTTTATCTST TCGGGAATAC 2050 
6ATGCCCATT aiACTTSTTG ACTGGTCTGA TATTCGTGAG CAAAAAC6AC 2100 
TTATGGTATT 60GA6CTTCA GTCGCACTAC ACGGTCGITC TGTTACTCTr 2150 
TATGAGAAAG 06TTCC0QCT TTCAGAGCAA T6TTCAAAGA AAGCTCATGA 2200 
CCAATTTCTA GCCQACCTTG CX3VGCATTCT ACCQAOTAAC ACCACACC6C 2250 
TCATTGTCA6 TGATGCTGGC TTTAAAGTGC CATGGTATAA ATCCGTTGAG 2300 
AAGCTGGGTT GGTACTGGTT AAGTCGAGTA AGAGGAAAAG TACAATATGC 2350 
AGACCTAGGA GCGGAAAACT GGAAACCTAT CAGCAACTTA CATGATATGT 2400 
CATCTAGTCA CTCAAAGACT TTAGGCTATA AGAGGCTGAC TAAAAGCAAT 2450 
CCAATCTCAT GCCAAATTCT ATTGTATAAA TCTCGCTCTA AAGGCCGAAA 2500 
AAATCAGCGC TCQACAC66A CTCATT6TCA CCACC06TCA CCTAAAATCT 2550 
ACTCAGOOrC GGCAAAOGaG CCATGGGTTC TAGCAACIAA CTTACCTGTT 2600 
GAAATT06AA CACCCAAACA ACTTOrTAAT ATCTATTCGA ABCQAATOCA 2650 
GATTQAASAA ACCTTCOGAG ACTTGAAAA6 TCCTGCCTAC G6ACTAGGCC 2700 
TAC6CCATA0 CCGAAC6A6C A6CTCAGAGC GTTTTGATAT CATGCTGCTA 2750 
ATC6CCCTGA TQCTTCAACT AACATGTTGG CTTGCGGGCG TTCATGCTCA 2800 
GAAACAAG6T TGGGACAAGC ACTTCCA6GC TAACACAGTC AGAAATCGAA 2850 
ACGTACTCTC AACAGTTCGC TTAGGCATGG AAGTTTTGCG GCATTCTGGC 2900 
TACACAATAA CAAGGGAAGA CTTACTCGTG 6CTGCAACCC TACTAGCTCA 2950 
AAATTTATTC ACACATGGTT ACGCTTTGGG GAAATTATGA TAATGATCCA 3000 
GATCACTTCT GGCTAATAAA AGATCAGAGC TCTAGAGATC TGT6TGTTGG 3050 
TTTTTTGTGG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 3100 
CCCTGCCCC6 TGCCTTCCTT GACCCTGGAA GGT6CCACTC CCACTGTCCT 3150 
TTCCTAAIAA AAT6AGGAAA TTGCATCGCA TTGTCOXSAGT AGGTGTCATT 3200 
CIATECTGG6 GG6T0GQGT6 GGGCA6CACA GCAAGGGGQA GGATTGGQAA 3250 
GAC3MXAGCA GGCATGCTQ6 G6ATGCGGIG GGCTCIATGQ GTACCTCTCT 3300 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT' CTCTCGOXAC CTCTGTCTCT 3350 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CG6TACCAG6 TGCTGAAGAA 3400 
TTQACCCGGT GACCAAAG6T GCCTTTTATC ATCACTTTAA AAATAAAAAA 3450 
CAATTACTCA GTGCCTGTTA TAAGCAGCAA TTAATTATGA TTGATGCCTA 3500 
CATCACAACA AAAACTGATT TAACAAATGO TTGGTCTGCC TTAGAAAGTA 3550 
TATTTQAACA TTATCTTGAT TATATTATTQ ATAATAATAA AAACCTTATC 3600 
CCTATCCAAG AAGTGATGCC TATCATTGGT TGGAATGAAC TTGAAAAAAA 3650 
TTAGCCTT6A ATACATTACT GGTAAGGTAA ACGCCATTGT CAGCAAATTG 3700 
ATCCAAGAGA ACCAACTTAA AGCTTTCCTG ACG6AATGTT AATTCTCGTT 3750 
GACCCTQAGC ACT6AT6AAT CCCCTAATGA TTTTG6TAAA AATCATTAAG 3800 
TTAAGGTGOA TACACATCTT GTCATATGAT CCC6GTAATG TGAiSTTAGCT 3850 
CACICAITAG GCACCCCA6G CTTTACACTT TAT6CTTC08 6CTG8TATGT 3900 
T8TGT8GAAT TGTSAGCGQA TAACAATTTC ACACAGGAAA CAGCTATGAC 3950 
CATGATTACG CCAA6C6CGC AATTAACCCT CACTAAAGGG AACAAAAOCT 4000 
G6AGCICCAC CGCGGTGOCO GCCGCTCTAG AACTAGTGGA TCCCCC6GGG 4050 
AGGTCAGAAT GGTTTCTTTA CTGTTTGTCA ATTCTATTAT TTCAATACAG 4100 
AACAAAAGCT TCTATAACTG AAATATATTT GCTATTGTAT ATTAIGATTG 4150 
TCCCTCGAAC CATGAACACT CCTCCAGCTG AATTTCACAA TTCCTCTGTC 4200 
ATCTGCCAGG CTGGAAGATC ATGGAAGATC TCTGAGGAAC ATTGCAAGTT 4250 
CATACCATAA ACTCATTTGG AATTGAGTAT TATTTTGCTT TGAATG6AGC 4300 
TATGrnTGC AGrrCCCTCA GAAGAAAA6C rrGTTATAAA GCGTCIACAC 4350 
CCATCAAAAG ATATATTTAA ATATTCCAAC TACAGAAAGA TTTTGTCT6C 4400 
TCTTCACTCT GATCTCAGTT GGTTTCTTCA OGTACATQCT TCTTTATTTS 4450 
CCTATTTTSr CAAQAAAATA ATAGGTCAA6 TCCT6TTCTC ACTTATCTCC 4500 
TGCCTAGCAT GGCTTAGATG CACGTTGTAC ATTCAAGAA6 GATCAAATGA 4550 
AACAGACTTC TGGTCTGTTA CAACAACCAT AGTAATAAAC AGACTAACTA 4600 
ATAATTGCTA ATTATGTTTT CCATCTCTAA GGTTCCCACA TTTTTCTGTT 4650 
TTAAGATCCC ATTATCTGGT TGTAACT6AA 6CTCAATGGA ACATGAACAG 4700 
TATTTCTCA6 TCTTTTCTCC AGCAATCCTQ ACGGATTAiSA AGAACTGGCA 4750 
GAAAACACTT TGTTACCCAG AATTAAAAAC TAATATTTGC TCTCCCTTCA 4800 
ATCCAAAATG GACCTATXGA AACTAAAATC TGACCCAATC CCATTAAATT 4850 
ATTTCIATCG CCTCAAAGGT CAAACTTTTG AAlGGGAACCT 6TGG6IGGGT 4900 
CCCAATTCAG GCTATATATT CCCCAGGGCT CAGCCAGTG6 ATCCATGGQC 4950 
TCCAT0G6TG CAGCAAGCAT G6AATTTT6T TTTGATGTAT TCAAG6AGCT 5000 
CAAAGTCCAC CATGCCAATG ACAACATGCT CTACTCCCCC TTTSCCATCT 5050 
TGTCAACTCT GGCCATGGTC TTCCTAGGTG CAAAAGACA6 CACCAGGACC 5100 
CAGATAAATA AGGTTGTTCA CTTTGATAAA CTTCCAGGAT TCGGAGACAO 5150 
TATTQAAGCT CAGTGTGGCA CATCTGTAAA TGTTCaCTCT TCACTTAGAQ 5200 
ACATACTCAA CCAAATCACC AAACAAAATG ATGCTTATTC GTTCAGCCTT 5250 
GCCAGTAGAC TTTATGCTCA A6AGACATAC ACAGTCGTGC CGGAATACTT 5300 
GCAATGTOTG AAGGAACTGT ATAGAGGAGG CTTAGAATCC GTCAACTTTC 5350 
AAACAGCTGC AGATCAAGCC AGAGGCXTTCA TCAATGCCTG GGTAGAAAGT 5400 
CA6ACAAACG GAATTATCA6 AAACATCCTT CAGCCAAGCT CC6TGGATTC 5450 
TCAAACT6CA ATGGTCCIGO TTAATQCCAT TGCCTTCAAO GGACT6TGG6 5500 
AGAAAGCATT TAA6GCT6AA 6ACAC6CAAA CAATACCTTT CA6AGTGACT 5550 
GAGCAAGAAA GCAAACCTGT GCAGATGATG TACCAGATtG GTTCATTTAA 5600 
AGT66CATCA ATG6CTTCT6 AGAAAATGAA GATCCT6GAG CTTCC!ATTTG 5650 
CCA0TGGAAC AATGAGCATG TTG6TGCTGT- TGCCTGATGA TOTCTCAGGC 5700 
CTTGAGCAGC TTGAGAGTAT AATCAGCTTT GAAAAACT6A CTGAATGGAC 5750 
CAGTTCTAGT ATTATGGAAG AGAGGAAGGT CAAAGTGTAC TTACCTCGCA 5800 
TGAAGATGGA GGAGAAATAC AACCTCACAT CTCTCTTAAT GGCTATGGGA 5850 
ATTACTGACC TGTTCAGCTC TTCAGCCAAT CTGTCTGGCA TCTCCTCAGT 5900 
AGGGA6CCTG AAGATATCTC AAGCTGTCCA TGCAGCACAT GCAGAAATCA 5950 
ATGAAGCGGG CAGAGATGTG GTAGGCTCAG CAGAGGCTGG AGTGGATGCT 6000 
ACTGAAGAAT TTAGGGCTGA CCATCCATTC CTCTTCTGTG TCAAGCACAT 6050 
CGAAACCAAC GCCATTCTCC TCmGGCAG ATGIGTTTCT CCGCGGCCAQ 6100 
CA6AXGA06C ACCAGCA6AT GAC6CACCA6 CAGAT6AC6C ACCA6CAGAT 6150 
GA06CACCA6 CAGATGACGC ACCAGCAGAT GAC6CAACAA CATfSTATCCT 6200 
GAAAG6CXCT TGIGGCTG6A TCG6CCIGCT GGATGACGAT GACAAAAAAT 6250 
ACAAAAAAfiC ACTGAAAAAA CTQGCAAAAC TGCTGTAATG AGGGCGCCT6 6300 
GATCCAGATC ACTTCTGGCT AATAAAAGAT 'CAGAGCTCTA QAGATCTSTG 6350 
TGTTGGTTTT TTGTGGATCT GCTGTGCCTT CTAGTTGCCA GCCATCTGTT 6400 
GTTT6CCCCT CCCCC6TGCC TTCCTTGACC CTGGAAGGTQ CCACTCCCAC 6450 
TGTCCTT T CC TAATAAAATO AGGAAATTGC ATCGCATTGT CTGAGTAGGT 6500 
GTCATTCTAT TCTGG6GGGT GGGGTGGGGC! AGCACAGCAA GGGGGA6GAT 6550 
tGGGAAGACA ATAGCAGCCA TGCTCGGGAT GCGGTGGGCT CTATGGGTAC 6600 
CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CTCTCTCTCT CGGTACCTCT 6650 
CTCGAGGGGG GGCCCGGTAC CCAATTCGCC CTATAGTGAG TCGTATTACG 6700 
CGOGCTCACT GGCCGTCGTT TTACAACGTC GTGACTGGGA AAACCCTGGC 6750 
GTTACCC3WVC TTAATCGCCT TGCAGCACAT CCCCCTtTCG CCAGCTGGCG 6800 
TAATAG08AA GAGGCCCGCA CCGATCGCCC TTCCCAACAO TTGCGCAGCC 6850 
TGAATGGOGA ATG6AAATT6 TAA6CGTIAA TATTTTGTTA AAATTC6CGT 6900 
TAAATTTTTG TTAAATCAGC TCATTTTTTA ACCAATAGGC CGAAATCGGC 6950 
AAAATCCCTT ATAAATCAAA AGAATA6ACC GAGATAGG6T TGAGTGTT8T 7000 
TOCag m 'GG AACAAGAGIC CACtATTAAA GAACGTGGAC TCCAAC6TCA 7050 
AAGG6CGAAA AACCOTCTAT CAGG6CQAT6 GCCCACTACT CCGGGATCAT 7100 
ATGACAAGAT GTGTATCCAC CTTAACTTAA TGATTTTTAC CAAAATCATT 7150 
AGGGGATTCA TCAGTGCTCA GGOTCAACGA GAATTAACAT TCCGTCAGGA 7200 
AAGCTTATGA TGAT6ATQTG CTTAAAAACT TACTCAATGG CTGGTTATGC 7250 
ATATCGCAAT ACATGCGAAA AACCTAAAAG AGCTTGCC6A TAAAAAAGGC 7300 
CAATTTATTG CTATTTACCG CGGCTTTTTA TTGAGCTTGA AAGATAAATA 7350 
AAATAGATA6 GrmATTTG AAGCTAAATC TTCTTTATCG TAAAAAATGC 7400 
CCT C rrGG CT TATCAASAG6 GTCATTATAT TTCGCGGAAT AACATtATTT 7450 
G6TGACGAAA lAACTAAGCA CTT6TCTCCT GTTTACTCCC CTGA6CTTGA 7500 
GGGGTTAACA TGAAGGTCAT CGATA6CAG6 ATAATAATAC AGTAAAACGC 7550 
TAAAC!CAATA ATCCAAATCC A6CCATCCCA AATTGGTAGT GAATGATTAT 7600 
AAATAACA6C AAACAGZAAT GQSCCAATAA CACCGGTTGC ATTGGTMG6 7650 
CTCACCAATA ATCCCTGTAA AGCACCTTGC TGATGACTCT TTGTTT6QAT 7700 
AGACATCACT CCCTGTAATG CAGGTAAA6C GATCCCACCA CCAGCCAATA 7750 
AAATtAAAAC AG66AAAACT AACCAACCTT CAGATATAAA CGCXAAAAA6 7800 
GCAAAT6CAC TACTATCTGC AATAAATCCG RGCAGTACTG CCGTTTTTTC 7850 
GCCCCATTTA GTGGCTATTC TTCCTGCCAC AAAGGCTTG6 AATACTGAGT 7900 
GTAAAAGACC AAGACCCGCT AATGAAAAGC CAACCATCAT GCTATTCCAT 7950 
CCAAAAC6AT TTTCGGTAAA TAGCACCCAC ACCCjTTGCGG GAATTTGGCC 8000 
TATCAATTGC GCTGAAAAAT AAATAATCAA CAAAATG6CA TCGTTTTAAA 8050 
TAAAGTQATG TATACCGAAT TCAGCTTTTG TTCCCTTTAG T6AGGQTTAA 8100 
TTGCGCGCTT GGCGTAATCA TGGTCATAGC TGTTTCCTGT CTGAAATTGT 8150 
TATCCX3CTCA CAATTCCACA CAACATACGA GCCGGAA6CA TAAAiGTGTAA 8200 
AGCCTGGGGT GCCTAATGAG TGAGCTAACT CACATTAATT GCX5TTGCGCT 8250 
CACT6CC06C TTTCCAGTCO GGAAACCT6T OGTGCCSVGCT GCATTAATQA 8300 
ATCG6CCAAC GCGOGGGGAG AGGCG6TTTG CGTATT6GGC GCTCTTCCGC 8350 
TTCCTCGCTC ACTQACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGQ 8400 
TATCAGCTCA CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT 8450 
AAC6CAGGAA A6AACATGTG AGCAAAAGGC CAGCAAAAGG CCAGGAACCG 8500 
TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG 8550 
AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA 8600 
CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC 8650 
TCXTCCGACC CTGCOGCTTA CCGQATACCT GTCCGCCTTT CTCCCTTCGG 8700 
GAAGCGTeGC 6CTTTCTC»T AGCTCACGCT 'GTAGGTATCT CA(nrTCGGT6 8750 
TAG6TC6TTC GCTCCAAQCT GGGCT6TGTQ CACGAACCCC CCGTTCAGCC 8800 
CGACCGCTQC 6CCTIATCCG GTAACTATC6 TCTrOAGTCC AACCCGGTAA 8850 
GACACGACTT ATC6CCACTO 6CAGCAGCCA CIQGTAACA6 6ATTA6CAQA 8900 
GCGAGOTATO TAGGCGGT6C TACAGAGTTC TTGAAGTGGT GGCCTAACTA 8950 
CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCT6 CTGAAGCCA6 9000 
TTACCTTC3GG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC 9050 
GCTGGTAGCG GTGQTTTTTT TGTTTGCAAG CAGCASATTA CGCGCAGAAA 9100 
AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGO TCTGACGCTC 9150 
A6T6GAACGA AAACTCACGT TAAGGGATTT TGGTCATGAO ATTATCAAAA 9200 
AGGATCTTCA CCTA6ATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT 9250 
C7AAASTATA TATGaUSIAAA CTTGGTCTGA CA6TTACCAA tTGCTTAATCA 9300 
GTGAGGO^ TATCTCAGOG ATCTGTCTAT TTCGTTCATC CATAGTTGCC 9350 
TQACTCCCCXS TCQTSTAGAT AACTA06ATA CaQGAGGGCT TACCATCTG6 9400 
CCCCAGTOCT 6CAATGATAC 06C6A6ACCC AC6CTCACC6 6CTCCAGAIT 9450 
TATCA6CAAT AAACCAGCCA 6CCGGAAGGQ CCGAGO6CA0 AA6TG6TCCT 9500 
GCAACTTTAT CCeCCTCCAT CCAGTCTATT AATTQIT6CC GGGAAGCTAO 9550 
AGTAAGTAGT TC6CCAGTTA ATAGTTTGCG CAACGTT6TT GCCATTGCTA 9600 
CAGGCATC6T GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC 9650 
66TTCCCAAC GATCAA6GCG AGTTACATGA TCCCCCATGT T6TGCSVAAAA 9700 
AGCG6TTAGC TCCTTCGGTC CTCCGATC6T TGTCAGAAGT AAGTTGGCCG 9750 
CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC 9800 
AT6CCATCCG TAAGATGCTT TTCT6TGACT 66TGAGTACT CAACCAAGTC 9850 
ArrCTGAOAA TA6T6TATQC GGOGACCGAG TTGCTCTTGC CC6GC6TCAA 9900 
TACXSGGATAA TACCGCQCCA CATAGCAGAA CTTTAAAAGT 6CTCATCATT 9950 
GGAAAAOGTT CTTCGGGGCG AAAACTCTCA AG8ATCTTAC CGCTGrrOAO 10000 
ATCCAGTTG6 ATOTAAOCCA CTCGIGCACC CAACTGATCT TCAGCATCTT 10050 
TTACTTTCAC CAGCGITTCT 6GGIGAGCAA AAACAGGAAG GCAAAAT6CC 10100 
GCAAAAAAGQ GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT 10150 
CCTTTTTCAA TATTATTGAA 6CATTTATCA GGGTTArrGT CTCATGA6CG 10200 
QATACATaTT TGAATGTATT TAGAAAAATA AACAAATA6G GGTTCCGCX3C 10250 
ACATTTCCCC GAAAAGT6CC AC 10272
SEQ ID NO:47 pTnMCS (CMV-CHOVg-ent-ProInsulin-synPA)

1 ctgacscgcc etgtagcggc gcattaagog cggcgggtgt ggcggttacg cgcagcgtga 
61 cegctacact tgccagcgcc ctagcgeecg etcctttcgc tttcttccct tcatttetcg
121 ccacgttcgc cggcatcaga ttggctattg gccatcgcaC acgttgtatc catatcataa 
181 tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt gattattgac 
241 tagttattaa tagtaatcaa ttacggggtc attagtteat agcccatata tggagttccg 
301 cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccact 
361 gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca 
421 atgggtggag tatttacggt aaactgccca cttggcagta eatcaagcgt atcatatgcc 
481 aagtacgccc cctattgacg tcaatgacgg taaatgscce gcctggeatK atgeecagta
541 catgacctta tgggactttc ctacttggca gtacatctae gtattagtca tcgctattac
601 catggcgatg cggttttggc agtacaccaa tgggcgtgga tagcggtttg actcacgggg
661 atttccaagt ctccacccca ctgacgccaa tgggagtttg ttttggcacc aaaatcaacg
721 ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg gtaggcgtgt
781 acggtgggag gtctatataa gcagagctcg tttagtgaac cgtcagatcg cctggagacg
841 ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc tccgcggccg
901 ggaacggtgc attggaacgc ggactccccg tgccaagagt gacgtaagta ccgcctatag
961 actctatagg cacacccctt tggctcctac gcatgctata ctgtttttgg cttggggcct
1021 atacaccccc gctcccttat gctataggtg atggtatagc ttagcctata ggcgcgggtt
1081 attgaccact attgaccact cccctaCtgg tgacgatact ttccattact aatccataac
1141 acggcccCCt gccacaacta cctctattgg ctatatgcca atactctgtc cttcagagac
1201 tgacacggac tctgtatttt tacaggatgg ggtcccattt attatttaca aattcacata
1261 tacaacaacg ccgtcccccg tgcccgcagc tcttattaaa catagcgtgg gatctccacg
1321 cgaatctcgg gtacgCgttc cggacatggg cccctccccg gtagcggcgg agcttccaca
1381 tccgagccct ggtcccatgc ctccagcggc tcatggtcgc tcggcagctc cttgctccta
1441 acagtggagg ccagacttag gcacagcaca atgcccacca ccaccagtgt gccgcacaag
1501 gccgtggcgg tagggtatgc gCccgaaaat gagcgtggag attgggctcg cacggctgac
1561 gcagatggaa gacttaaggc agcggcagaa gaagatgcag gcagctgage tgtcgtattc
1621 tgataagagt cagsggtaac tcccgttgcg gtgctgttaa cggtggaggg cagtgtagte
1681 tgagcagtac tcgttgctgc cgcgcgcgcc accagaeata atagctgaca gactaacaga
1741 ctgttecttt ccatgggtct tttctgcagt caccgtcgga ccatgtgcga actegatatc
1801 Ctacacgact ctcttcacca actctgcccc gaattacact taaaacgact caacagctta
1861 acgttggctt gccacgcatc acttgactgt aaaactctca etettaeoga actcggccgt
1921 aacctgccaa ccaaagcgag aacaaaacat aacatcaaae gaatcgaeog attgttaggt
1981 aatogteaec tecacaaaga gcgactcgct gtatacegtc ggcaegctag ctttatccgc
2041 tcgggeaata egatgcecat tgtacttgtt gactggtctg atateegtga gcaaaaacga
2101 cttatggtat tgogagcttc agccgcacta cacggtogtt etgttaotet teatgagaaa
2161 gcgtteccge tttccigagea acgtteaeug aaagctcatg accaattcct agcogacett
2221 gegagcattc taecgagtaa eaiecaeaccg cteattgtca gtgatgetgg etctaaagtg
2281 ccatggtata aateogttga gaagctgggt tggtactggt taagtcgagt aagaggaaaa
2341 gtacaatatg eagacetagg agcggaaaac tggaaaccta tcagicaaett acatgatacg
2401 tcatctagtc actcaaagac Cttaggctat aagaggctga ctaaaageaa tccaatctca
2461 tgccaaattc taCCgtataa acctcgctcc aaaggcogaa aaaatcagcg ctcgacacgg
2521 actcattgto accacccgtc acctaaaatc tactcagcgc cggcaaagga gccatgggtt
2581 etagiaacta acttacctgt tgaaattcga acacccaaac aacttgttaa tatctattcg
2641 aagcgaatgc agattgaaga aaccctccga gacttgaaaa gtcctgccta cggactaggc
2701 ctacgccata gccgaacgag cagctcagag cgttttgata tcatgctgct aatcgccctg
2761 aCgcttcaac taacacgctg gcttgcgggc gttcatgctc agaaacaagg ttgggacaag
2821 cacttccagg ctaacacagt cagaaatcga aacgtactct caacagttcg cccaggcatg
2881 gaagttttgc ggcattctgg ctacacaata acaagggaag acttactcgt ggctgcaace
2941 ctactagctc aaaactcatt cacacacggc tacgctttgg ggaaattatg aggggatege
3001 tctagagcga tccgggatct cgggaaaagc gttggtgacc aaaggtgcct tttatcatca
3061 ctttaaaaac aaaaaacaat tactcagtgc ctgttataag cagcMttaa ttatgattga
3121 tgcctacatc acaacaaaaa ctgattcaac aaacggttgg Cctgccttag aaagtatatt
3181 tgaacattat cttgattata ttattgataa taataaaaac cttatcccta tccaagaagt
3241 gatgcctatc attggttgga atgaacttga aaaaaattag ccttgetatac attactggta
3301 aggtaaacgc cattgtcagc aaattgaccc aagagaacca acttaaagct tccctgacgg
3361 aatgttaatt ctcgttgacc ctgagcactg atgaatcccc taatgatttt ggtaaaaatc
3421 attaagttaa ggtggataca catcttgtca tatgatcccg gtaatgtgag ttagctcact
3481 cattaggcac cccaggcttt acactttaCg cttccggctc gtatgttgtg tggaattgtg
3541 agcggataac aatttcacac aggaaacagc tacgaccatg attacgccaa gcgcgcaatt
3601 aaccctcact aaagggaaca aaagctggag ctccaccgcg gtggcggccg ctctagaact
3661 agtggatccc ccgggcatca gatcggctat tggccattgc atacgttgta tccatatcat
3721 aatatgtaca tctacatcgg ctcatgecca acattaccgc catgttgaca ctgattattg
3781 aetagttatt aatagtaatc aatcacgggg tcattagcce atagcccata catggagtcc
3841 cgcgttacae aactcacggt aaatggcccg cctggctgac cgcceaacga cccccgccca
3901 ttgacgtcaa taatgacgca tgcccccata gtaacgccaa tagggacttt ccaccgacgt
3961 caatgggtgg agtaettacg gtaaactgcc cacttggcag cacaccaagt gcatcatacg
4021 ccaagtapgc cccccactga cgtcaatgae ggtaaatggc ccgcctggca ttatgcccag
4081 tacatgacct tatgggactt ccctacttgg cagtacatct acgtattagt catcgccatt
4141 accaeggcga egcggttttg gcagtacatc aatgggcgtg gatagcggCC Cgacfccacgg
4201 ggacetccaa gtctccaccc cactgacgtc aatgggagtt tgtcttggca ccaaaatcaa
4261 cgggaebttc caaaatgtcg taacaactcc gccccattga cgcaaatggg cggtaggcgt
4321 gtacggtggg aggtctatat aagcagagct cgtttagtga accgtcagat cgcctggaga
4381 cgccatccac gctgttttga cctccataga agacaccggg accgatccag cccccgcggc
4441 cgggaacggt gcattggaac gcggattccc cgtgccaaga gtgacgtaag taccgcctat
4501 agactctata ggcacacccc tttggctcct atgcatgcta taccgttttt ggcttgggge
4561 etatacaccc ccgcttcctt atgctatagg tgatggtata gctcagccta taggtgtggg
4621 ttattgacca ttattgacca ctcccccatt ggtga^ata ccttccatta ctaatccaca
4681 acatggctct ttgccacaac tatctctatt ggctatatgc caatactctg tccttcagag
4741 accgacacgg actctgtatt tttacaggat ggggtcccac ttattattta caaattcaca
4801 tatacaacaa cgcogtcccc cgtgccogca gtttttatta aacatagcgt gggatctcca
4861 cgegaatctc gggtacgtgt tccggacatg ggctcttctc cggtagcggc ggagctccca
4921 catccgagcc ctggtcccat gcctccagcg gctcatggte gctcggcagc cccttgctcc
4981 taacagtgga ggccagactt aggcacagca caatgcccac caccaccagt gtgccgcaca
5041 aggecgtggc ggtagggtat gcgtctgaaa atgagcgtgg agattgggct cgcacggctg
5101 acgcagatgg aagacttaag gcagcggcag aagaagatgc aggcagctga gttgttgtat
5161 tctgaeaaga gtcagaggta actcccgttg cggtgctgtt aacggtggag ggcagtgcag
5221 Cctgagcagt acccgttgcc gccgcgcgcg ccaccagaca taatagctga cagactaaca
5281 gactgttccc ttccatgggc cttctctgca gtcaccgtcg gaCcaatggg ctccatcggt
5341 geagcaagca tggaattttg ttttgatgta ttcaaggagc tcaaagtcca ccatgccaat
5401 gagaacatct cctactgccc cattgccatc atgtcagctc tagccatggt atacctgggt
5461 gcaaaagaca geaccaggac acaaataaat aaggttgttc gccttgataa acttccagga
5521 ttcsgagaca gtattgaagc tcagcgtggc acatctgtaa acgttcaetc Ctcacttaga
5581 gacatcctca accaaatcac caaaccaaat gatgtttacc cgctcagcet tgccagtaga
5641 ctttatgctg aagagagata cccaatcctg ccagaatacC tgcagtgtgt gaaggaaetg
5701 tatagaggag gcttggaacc catcaacttt caaacagctg cagatcaagc cagagagctc
5761 atcaattcct gggtagaaag tcagacaaat ggaattatca gaaatgtcct tcagccaagc
5821 tccgtggact ctcaaactgc aat^ttctg gttaatgcca ttgtcttcaa aggaetgtgg
5881 gagaaagcat ttaaggatga agacacacaa gcaatgcctt tcagagtgac tgagcaagiia
5941 agcaaacctg tgcagatgat gtaccagatt ggtttattta gagtggcaCc aatggcttct
6001 gagaaaatga agatcctgga gcttccattt gccagtggga caatgagcat gttggtgctg
6061 ttgcctgatg aagtctcagg ccttgagcag cttgagagta taatcaactt tgaaaaactg
6121 actgaaegga ccagttctaa cgttatggaa gagagaagat caaagcgtac ttacctegca
6181 tgaagakgga ggaaaaatac aacctcacat ctgtcbtaat ggctatgggc attactgaegi
6241 tgtttagctc ttcagccaat cCgtctggca tctcctcagc agagagectg aagatacetc
6301 aagctgtcca tgcagcacat gcagaaatca atgaagcagg cagagaggtg gtagggtcag
6361 cagaggctgg agtggatgct gcaagcgtct ctgaagaatt tagggctgac catccattcc
6421 tcttctgtat caagcacatc gcaaccaacg ccgttctctt cttttggcag atgtgtttce
6481 cgcggccagb agatgacgca ccagcagatg acgcaccagc agatgacgca ccagcagatg
6541 acgcaccagc agatgacgca acaacatgta tcctgaaagg ctcttgtggc tggatcggce
6601 tgctggatga cgatgacaaa tttgtgaacc aacacctgtg cggctcacae etggtggaag
6661 ctctctacct agtgtgcggg gaaegaggct tcttctacac acceaagaee cgecgggagg
6721 cagaggacct gcaggtgggg caggtggagc tgggcggggg ccctggtgca ggcagectga
6781 ageeettgge cctggagggg teeetgeaga agogtggcat tgtggaacaa tgctgeaeca
6841 gcatctgcte cctetaccag ctggagaaet aotgcaacca gggcgcctaa agggcgaatt
6901 atcgcggeeg etetagacca ggcgcctgga tccagatcac tcecggctaa taaaagatea
6961 gagetctaga gatctgtgtg ttggtetttt gtggaCGtgo tgtgeettct agtcgccagc
7021 caeetgttgt Ctgcceetcc ccegcgeett ccccgaocct ggaaggegcc acecccaceg
7081 teetttccta ataaaatgag gaaattgeac cgcattgtot gagtaggtgt eattetacto
7141 tggggggtgg ggtggggcag cacagcaagg gggaggattg ggaagaeaac agcaggeatg
7201 etggsgatge ggtgggctct atgggtacct ctetetctct ctctetetet cteaotctct
7261 ctctctcteg gtacctctcc tcgagggggg gcccggtaec eaatccgecc tatagtgagt
7321 egtattacgc gcgetcactg gccgtegttt taeaacgtcg tgactgggaa aaccetggeg
7381 ttaeccaact taatcgcctt gcagcaeate cccctttegc cagceggegt aatagogaag
7441 aggcccgcac egiatcgccct tcecaaeagt tgcgcagcct gaatggcgaa tggaaaetgt
7501 aagegttaat attttgttaa aattcgegtt aaatttttgt taaatoagct cattCtttaa
7561 eeaataggce gaaaccggea aaatceetta taaatcaaaa gaatagaccg agatagggtt
7621 gagtgttgtt ccagtttgga acaagagtcc actattaaag aacgtggact ccaacgtciui
7681 agggegaaaa accgtetate agggegatgg cecactacte cgggatcata tgacaagatg
7741 tgtatecaec ttaacttaat gatttttacc aaaatcatta ggggattcat cagtgctcag
7801 ^tcaacgag aabtaacatt ccgteaggaa agcttatgat gatgatgtgc ttaaaaactt
7861 acteaatggc tggttatgca tatcgcaata catgcgaaaa acctaaaaga gcttgccgat
7921 aaaaaaggce aatttattgc tatttaccgc ggctttttat tgagcttgaa agataaataa
7981 aatagatagg tcttatttga agctaaatct tctttatcgt aaaaaatgcc ctcctgggtt
8041 atcaagaggg teattatatt tcgoggaata acatcatttg gtgaogaaat aactaagcac
8101 ttgtctcctg tctactcccc tgagcttgag gggttaacat gaaggtcatc gatagcagga
8161 taataataca gcaaaacget aaaccaataa tccaaatcca gceatcecaa attggtagtg
8221 aatgattata aataacagca aacagtaatg ggccaataac accggtcgca ttggtaaggc
8281 tcaccaataa tccctgtaaa gcaccttgct gatgactctt tgtttggata gacatcactc
8341 eecgeaatgc aggtaaagcg atcccaccac cagccaataa aattaaaaca gggaaaacta
8401 accaaccttc agatataaac gctaaaaagg caaacgcact actatctgca ataaatccga
8461 gcagtactgc cgttttttcg cccacttagt ggctattctt cctgccacaa aggcttggaa
8521 tactgagtgt aaaagaccaa gacccgcaat gaaaagccaa ccatcatgct attcatcatc
8581 aogatttctg taatagcacc acaccgtgct ggattggcta tcaacgcgct gaaataataa
8641 teaacaaatg geategttaa ataagtgatg eatacegatc agetttegtt ccecttagtg
8701 agggttaatt gegcgcttgg cgtaatcatg gtcacagetg tttcctgtgt gaaatcgtta
8761 tccgcccaca ateccacaca acatacgagc eggaageata aagtgtaaag ectggggtge 
8821 ctaatgagcg agctaactca cattaactgc gtcgegetca ctgccegctt cccagteggg
8881 aaaectgtcg tgecagctgc attaatgaat eggeeaaege gcggggagag geggttegeg 
8941 tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgetcggtog ttoggetgcg 
9001 gogagoggta teagctcaet caaaggeggt aatacggtta tccacagaat eaggggataa
9061 egeaggaaag aacatgtgag eaaaaggcca gcaaaaggee aggaaccgta aaoaggcogc 
9121 gttgctggeg ttttcccata ggctcogccc ccctgacgag catcaeaaaa atcgaogctc 
9181 aagtcagagg tggcgaaace cgaca^acC ataaagatac caggcgttte eecetggaag 
9241 ctccetegtg cgctetcctg ttc^accct gccgcttaec ggatacctgt cegcctttct 
9301 eccttcsgga agcgtggcgc tttctcatag ctcacgctgt aggtacctea gtteggtgta 
9361 ggtegttcgc tceaagctgg gctgtgtgca cgaacccccc gttcagcccg aecgetgcge 
9421 cttatccggt aactatcgtc ttgagtccaa ccoggtaaga cacgaettat cgccactggc 
9481 agcagccact ggtaacagga ttagcagage gaggtatgta ggcggtgeta eagagttett 
9541 gaagtggtgg cctaactacg gcbacactag aaggacagta tttggtatct gegetctgct
9601 giiagccagtt accttcggaa aaagagttgg Cagctctcga tccggcaaae aaaecaccgc 
9661 tggtagcggt ggtttfctttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca 
9721 agaagatcct btgatctttt ctacggggtc tgacgctcag tggaacgaaa aetcacgtta 
9781 agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa 
9841 atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg 
9901 cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg
9961 actcccegtc gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc 
10021 aatgataccg cgagacccac gctcacoggc tccagattta tcagcaataa accagccagc 
10081 cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa
10141 ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc
10201 cattgccaca ggcatcgtgg tgtcacgctc gtcgttcggt atggcttcat tcagctccgg 
10261 Ctcccaacga tcaaggcgag ttacatgacc ccccatgttg tgcaaaaaag cggttagctc
10321 cctcggicct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat 
10381 ggcagcactg cataactctc ttactgtcat gccatccgta agatgctctt ctgtgactgg 
10441 tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc
10501 ggcgtcaata cgggataata ccgcgccaca tiagcagaact ttaaaagtgc tcatcattgg
10561 aaaacgctct tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat 
10621 gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg 
10681 gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg
10741 ctgaatactc atactcttcc tctttcMta ttattgaagc atttatcagg gttattgtct 
10801 catgagcgga tacatatetg aatgtattta gaaaaataaa caaatagggg ttccgcgcac
10861 atttccecga aaagtgccac 
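
The listing for SEQ ID NO:47 gives the 1-based position of the first base at the start of each line, followed by blocks of ten bases, sixty bases per line. The following is a minimal Python sketch of how a listing in this layout could be read back into one continuous sequence while checking the position numbers. The function name `parse_listing` and the sample lines are hypothetical and are not part of the specification.

```python
def parse_listing(lines):
    """Join position-numbered listing lines into one continuous sequence.

    Hypothetical helper: each non-empty line is assumed to start with the
    1-based position of its first base, followed by whitespace-separated blocks.
    """
    sequence = []
    count = 0  # number of bases collected so far
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        position = int(parts[0])
        if position != count + 1:
            raise ValueError(f"line starts at {position}, expected {count + 1}")
        bases = "".join(parts[1:])
        sequence.append(bases)
        count += len(bases)
    return "".join(sequence)


# Synthetic sample (not taken from SEQ ID NO:47): one full line and one short line.
sample = [
    "1 acgtacgtac acgtacgtac acgtacgtac acgtacgtac acgtacgtac acgtacgtac",
    "61 ttgacc",
]
joined = parse_listing(sample)
print(len(joined))  # 66
```

A mis-read position label raises an error in this sketch instead of silently shifting the sequence.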



SEQ ID NO:48 (cecropin prepro)
AAT TTC TCA AGG ATA TTT
TTC TTC GTG TTC GCT TTG
GTT CTG GCT TCG TCA ACA
GTT TCG GCT GCG CCA GAG
CCG AAA

SEQ ID NO:49 (cecropin prepro extended)
AAT TTC TCA AGG ATA TTT
TTC TTC GTG TTC GCT TTG
GTT CTG GCT TTG TCA ACA
GTT TCG GCT GCG CCA GAG
CCG AAA TGG AAA GTC TTC
AAG

SEQ ID NO:50 (cecropin pro)
GCG CCA GAG CCG AAA

SEQ ID NO:51 (cecropin pro extended)
GCG CCA GAG CCG AAA TGG AAA GTC TTC AAG

SEQ ID NO:52 (a Kozak sequence)
ACCATGT
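
The cecropin pro sequences of SEQ ID NO:50 and SEQ ID NO:51 are written as codon triplets, so the peptides they encode can be read off with the standard genetic code. The following is a minimal Python sketch of that translation. The helper name `translate` is hypothetical, and the use of the standard codon table is an assumption rather than something stated in the listing.

```python
# Standard genetic code, built from the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into one-letter amino acid codes."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


seq_50 = "GCG CCA GAG CCG AAA"                      # SEQ ID NO:50
seq_51 = "GCG CCA GAG CCG AAA TGG AAA GTC TTC AAG"  # SEQ ID NO:51

print(translate(seq_50))  # APEPK
print(translate(seq_51))  # APEPKWKVFK
```

Under these assumptions the extended pro sequence encodes the same five residues as SEQ ID NO:50 followed by five further residues.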
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Claims 

1. Method of producing proteins, polypeptides or peptides comprising

(i) administering a composition to an oviduct or an ovary of a bird, wherein the composition comprises a
transposon-based vector which comprises:

a) a transposase gene operably linked to a first promoter, the transposase gene encoding for a transposase;
and

b) one or more genes of interest operably-linked to one or more additional promoters; wherein the one or
more genes of interest and their operably-linked promoters are flanked by transposase insertion sequences
recognized by the transposase, and wherein the first promoter comprises a modified Kozak sequence
comprising ACCATG (SEQ ID NO:1) and

(ii) permitting the one or more genes of interest to be expressed into a protein, a polypeptide or a peptide.

2. The method of claim 1, wherein the composition is to be injected into an artery leading to the oviduct or the ovary.

3. The method of claim 1, wherein the composition is to be injected into a lumen of the oviduct.

4. The method of claim 1, wherein the composition further comprises a transfection reagent.

5. The method of claim 1, wherein one to twenty codons at a beginning of the transposase gene are modified by
changing a nucleotide at a third base position of the codon to an adenine or thymine without modifying the amino
acid encoded by the codon.

6. The method of claim 1, wherein the transposon-based vector comprises:

a) a transposase gene operably-linked to a first promoter and an avian optimized polyA sequence, the
transposase gene encoding for a transposase; and

b) one or more genes of interest operably-linked to one or more additional promoters;

c) wherein the one or more genes of interest and their operably-linked promoters are flanked by transposase
insertion sequences recognized by the transposase.

7. The method of claim 6, wherein the first promoter is a constitutive promoter.

8. The method of claim 6, wherein the first promoter is an oviduct-specific promoter selected from the group consisting 
of ovalbumin, ovotransferrin, ovomucoid, ovomucin, g2 ovoglobulin, g3 ovoglobulin, ovoflavoprotein, and ovostatin. 

9. The method of claim 6, wherein the one or more genes of interest is operably-linked to a second promoter.

10. The method of claim 9, wherein the second promoter is an oviduct-specific promoter selected from the group 
consisting of ovalbumin, ovotransferrin, ovomucoid, ovomucin, g2 ovoglobulin, g3 ovoglobulin, ovoflavoprotein, and 
ovostatin. 

11. The method of claim 6, wherein the transposon-based vector further comprises an egg directing sequence or an
enhancer operably-linked to the one or more genes of interest.

12. The method of any of claims 1 to 11, wherein the animal is a poultry bird.

13. The method of any of claims 1 to 12, wherein the transposase is a Tn10 transposase.
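
Claim 5 recites modifying one to twenty codons at the beginning of the transposase gene by changing the third base to adenine or thymine without changing the encoded amino acid. The following is a minimal Python sketch of such a substitution, offered as an illustration only and not as the procedure used in the specification. It assumes the standard genetic code; the function name `shift_third_base_to_at`, the choice to try adenine before thymine, and the example sequence are all hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of a DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def shift_third_base_to_at(dna, n_codons=20):
    """Change the third base of the first n_codons codons to A or T when synonymous.

    Only the third base is altered; a codon is left unchanged if neither
    substitution keeps the same amino acid (for example ATG or TGG).
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    out = []
    for start in range(0, usable, 3):
        codon = dna[start:start + 3]
        if start // 3 < n_codons and codon[2] in "GC":
            for third in "AT":  # assumption: try adenine first, then thymine
                candidate = codon[:2] + third
                if CODON_TABLE[candidate] == CODON_TABLE[codon]:
                    codon = candidate
                    break
        out.append(codon)
    out.append(dna[usable:])  # keep any trailing partial codon unchanged
    return "".join(out)


example = "ATGCTGGCCGGGAAC"  # hypothetical sequence, not from the specification
modified = shift_third_base_to_at(example)
print(modified)  # ATGCTAGCAGGAAAT
print(translate(example) == translate(modified))  # True
```

In this sketch the start codon ATG is left as it is because no codon ending in adenine or thymine encodes methionine.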



Claims (translated from the German text)

1. Method for producing proteins, polypeptides or peptides, comprising

(i) administering a composition to an oviduct or an ovary of a bird, wherein the composition comprises a
transposon-based vector which comprises:

a) a transposase gene operably linked to a first promoter, wherein the transposase gene encodes a
transposase, and

b) one or more genes of interest operably linked to one or more additional promoters, wherein the one or
more genes of interest and their operably linked promoters are flanked by transposase insertion sequences
which are recognized by the transposase, and wherein the first promoter comprises a modified Kozak
sequence comprising ACCATG (SEQ ID NO:1), and

(ii) allowing the one or more genes of interest to be expressed into a protein, a polypeptide or a peptide.

2. The method according to claim 1, wherein the composition is to be injected into an artery leading to the oviduct
or to the ovary.

3. The method according to claim 1, wherein the composition is to be injected into a lumen of the oviduct.

4. The method according to claim 1, wherein the composition further comprises a transfection reagent.

5. The method according to claim 1, wherein one to twenty codons at the beginning of the transposase gene are
modified by changing a nucleotide in a third base position of the codon to an adenine or thymine without changing
the amino acid sequence encoded by the codon.

6. The method according to claim 1, wherein the transposon-based vector comprises:

a) a transposase gene operably linked to a first promoter and an optimized avian polyA sequence, wherein the
transposase gene encodes a transposase, and

b) one or more genes of interest operably linked to one or more additional promoters;

c) wherein the one or more genes of interest and their operably linked promoters are flanked by transposase
insertion sequences which are recognized by the transposase.

7. The method according to claim 6, wherein the first promoter is a constitutive promoter.

8. The method according to claim 6, wherein the first promoter is an oviduct-specific promoter selected from the
group consisting of ovalbumin, ovotransferrin, ovomucoid, ovomucin, g2 ovoglobulin, g3 ovoglobulin,
ovoflavoprotein and ovostatin.

9. The method according to claim 6, wherein the one or more genes of interest are operably linked to a second
promoter.

10. The method according to claim 9, wherein the second promoter is an oviduct-specific promoter selected from the
group consisting of ovalbumin, ovotransferrin, ovomucoid, ovomucin, g2 ovoglobulin, g3 ovoglobulin,
ovoflavoprotein and ovostatin.

11. The method according to claim 6, wherein the transposon-based vector further comprises an egg-directing
sequence or an enhancer operably linked to the one or more genes of interest.

12. The method according to any one of claims 1 to 11, wherein the animal is poultry.

13. The method according to any one of claims 1 to 12, wherein the transposase is a Tn10 transposase.



Claims (translated from the French text)

1. Method of producing proteins, polypeptides or peptides comprising

(i) the administration of a composition to an oviduct or an ovary of a bird, the composition comprising a
transposon-based vector comprising

a) a transposase gene operably linked to a first promoter, said transposase gene encoding a transposase;
and

b) at least one gene of interest operably linked to at least one additional promoter; said at least one gene
of interest and their operably linked promoters being flanked by transposase insertion sequences recognized
by the transposase, and the first promoter comprising a modified Kozak sequence comprising ACCATG
(SEQ ID NO:1) and

(ii) the expression of said at least one gene of interest into a protein, a polypeptide or a peptide.

2. Method according to claim 1, wherein the composition is intended to be injected into an artery leading to the
oviduct or ovary.

3. Method according to claim 1, wherein the composition is intended to be injected into the lumen of the oviduct.

4. Method according to claim 1, wherein the composition further comprises a transfection reagent.

5. Method according to claim 1, wherein from one to twenty codons located at the beginning of the transposase gene
are modified by exchanging a nucleotide at the third base position of the codon for an adenine or a thymine without
modifying the amino acid encoded by the codon.

6. Method according to claim 1, wherein the transposon-based vector comprises:

a) the transposase gene operably linked to a first promoter and an optimized avian polyA sequence, said
transposase gene encoding a transposase; and

b) at least one gene of interest operably linked to at least one additional promoter;

c) said at least one gene of interest and their operably linked promoters being flanked by transposase insertion
sequences recognized by the transposase.

7. Method according to claim 6, wherein the first promoter is a constitutive promoter.

8. Method according to claim 6, wherein the first promoter is an oviduct-specific promoter selected from the group
consisting of ovalbumin, ovotransferrin, ovomucoid, ovomucin, g2 ovoglobulin, g3 ovoglobulin, ovoflavoprotein
and ovostatin.

9. Method according to claim 6, wherein the at least one gene of interest is operably linked to a second promoter.

10. Method according to claim 9, wherein the second promoter is an oviduct-specific promoter selected from the
group consisting of ovalbumin, ovotransferrin, ovomucoid, ovomucin, g2 ovoglobulin, g3 ovoglobulin,
ovoflavoprotein and ovostatin.

11. Method according to claim 6, wherein the transposon-based vector further comprises an egg control sequence or
an enhancer operably linked to the at least one gene of interest.

12. Method according to any one of claims 1 to 11, wherein the animal is a poultry bird.

13. Method according to any one of claims 1 to 12, wherein the transposase is a Tn10 transposase.